One day, a crazy idea popped into my head. Suppose you are writing an application. A web application. One that exposes an endpoint… Sounds interesting, right? Hang on, here is the best part: suppose it accepts data! Woohoo, that’s surprising! And a big deal, indeed!

Now, assuming you decided to build on the Play Framework, how can you handle incoming data? What options do you have? In this post, I would like to give an overview of the techniques and capabilities for dealing with data received in a request body. Let’s ride.

The official documentation

The Framework’s documentation provides an overview of what is available out of the box. You can also find a bunch of custom parser examples there.

Parser’s building blocks

Everything starts with the play.mvc.BodyParser interface. It is the most basic interface that all parsers in Play extend, and you have to use it to create a custom one – directly or by extending an existing implementation.

The interface defines only a single method, the apply method, which returns an instance of the play.libs.streams.Accumulator interface. A request’s body content is processed in a streaming fashion, and the accumulator holds the result of it.

What is cool is the possibility to create the accumulator from any Akka Streams Sink instance. So you can use the FileIO API to download a multipart file or send a notification with Reactive Kafka. The set of available sinks is quite significant – just check the Akka Streams and Alpakka projects for implementations.
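To give you a taste, here is a minimal sketch of mine (illustrative, not code from the Framework or my repository): a parser built from a folding Sink that concatenates all incoming chunks into a single ByteString and always reports success.

import akka.stream.javadsl.Sink;
import akka.util.ByteString;
import play.libs.F;
import play.libs.streams.Accumulator;
import play.mvc.BodyParser;
import play.mvc.Http;
import play.mvc.Result;

import java.util.concurrent.CompletionStage;
import java.util.concurrent.Executor;

public class RawConcatBodyParser implements BodyParser<ByteString> {

    private final Executor executor;

    public RawConcatBodyParser(Executor executor) {
        this.executor = executor;
    }

    @Override
    public Accumulator<ByteString, F.Either<Result, ByteString>> apply(Http.RequestHeader request) {
        // Fold the streamed chunks into one ByteString...
        Sink<ByteString, CompletionStage<ByteString>> sink =
                Sink.fold(ByteString.emptyByteString(), ByteString::concat);
        // ...and wrap the final value in Either.Right to signal successful parsing.
        return Accumulator.fromSink(sink).map(F.Either::Right, executor);
    }
}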

When writing a custom parser, you can utilize any object that is available for injection. Useful ones include:
play.api.http.HttpConfiguration – provides all HTTP settings;
play.http.HttpErrorHandler – helpful when dealing with both client and server errors;
play.api.mvc.PlayBodyParsers – holds all of Play’s built-in parsers.

Working with parsing results

The outcome of parsing data can be twofold: either an instance of a class holding the body’s content or information about an error.

In the former situation, i.e. when everything went smoothly, we can manipulate the accumulated value with Accumulator’s map and mapFuture methods.

The latter comes in the form of an instance of the Result class with a status code and an optional message explaining the reason for the failure. You can recover from such situations with two methods defined on Accumulator: recover and recoverWith.
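As a small sketch (accumulator and executor are illustrative names for an existing accumulator instance and an injected Executor), turning every parsing failure into a 400 response could look like this:

final Accumulator<ByteString, F.Either<Result, JsonNode>> safe =
        accumulator.recover(
                // Translate any exception thrown during parsing into a Bad Request.
                throwable -> F.Either.Left(Results.badRequest(throwable.getMessage())),
                executor);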

Accumulator has yet another interesting method: Accumulator.through(akka.stream.javadsl.Flow). You can use it if you want to apply a flow to the data before passing it to the accumulator. Where can this be useful? You can compute a digest of a ByteString stream or perform some validation on the fly as the data arrives.
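For example, a pass-through flow computing a digest on the fly could look roughly like this (a sketch, not code from my repository): each chunk updates a MessageDigest and is passed along unchanged.

import akka.NotUsed;
import akka.stream.javadsl.Flow;
import akka.util.ByteString;
import java.security.MessageDigest;

// Build a pass-through flow that feeds every chunk into the digest.
static Flow<ByteString, ByteString, NotUsed> digesting(MessageDigest digest) {
    return Flow.<ByteString>create().map(chunk -> {
        digest.update(chunk.toArray()); // side effect: update the running digest
        return chunk;                   // pass the chunk along unchanged
    });
}

// usage: accumulator.through(digesting(MessageDigest.getInstance("SHA-256")))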

Using parsers

In my Todo application, which I created some time ago, you can find a REST endpoint accepting a request with a body. The body contains data about a single todo item in JSON format. I annotated the method handling this data with @BodyParser.Of(BodyParser.Json.class). What does it do? It is responsible for matching the value of the Content-Type header (application/json is expected) and verifying whether the received request has a body. These are two things I do not have to think about, so I can simply focus on the business logic:

@BodyParser.Of(BodyParser.Json.class)
public Result addItem() {
    JsonNode json = request().body().asJson();
    TodoItemRequest request = fromJson(json, TodoItemRequest.class);
    // do something with the data and return a response, e.g.:
    return ok();
}

What will happen if there is no parser specified on top of the method and the received request has a body? The framework will use the play.mvc.BodyParser.Default parser, which applies a concrete (built-in) implementation based on the mime type found in the Content-Type header.

A custom implementation

Ok, but what should we do if the existing parsers cannot handle our use case? Well, write a custom one!

I have played with three simple implementations handling CSV, JSON/XML, and the Protobuf protocol.

CSV content validated

Basic stuff first. The documentation shows an implementation of a parser working with the CSV format. However, it does not apply any validation during the parsing process. Thus, I created a parser based on the OpenCSV library.

In contrast to the one from the documentation, I based the OpenCsv parser on play.mvc.BodyParser.BufferingBodyParser. Thanks to this, I can apply the parser to the whole body, buffered in memory. When dealing with larger request contents, processing data in a streaming fashion – as in the documentation – is the way to go.

If you use the buffering parser as the base class, you need to provide three things: the maximum amount of memory available for the buffer, an error handler, and an error message prefix. I took the value for the memory setting from play.api.http.ParserConfiguration and injected the framework’s default error handler, as shown below.

@Inject
protected OpenCsvBodyParser(ParserConfiguration config, HttpErrorHandler errorHandler) {
    super(config.maxMemoryBuffer(), errorHandler, ERR_MSG_PREFIX);
}

When you build on BufferingBodyParser, the apply(..) method is responsible for buffering the data up to the given memory limit and passing it to the parse(Http.RequestHeader, ByteString) method, which you must implement. It is the place where all the parsing magic happens.

If there is any parsing error, I throw an exception describing the issue, which is translated into a response with the help of the error handler injected in the constructor. In the example, I check the Content-Type header value, look for a CSV header, and ensure all due dates have the expected format. I used OpenCSV’s capabilities for the last two checks; a simplified sketch follows below.
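Here is a simplified sketch of what such a parse method could look like (OpenCSV’s CSVReader is assumed; the actual parser performs more thorough checks):

import akka.util.ByteString;
import com.opencsv.CSVReader;
import play.mvc.Http;

import java.io.StringReader;
import java.util.List;

@Override
protected List<String[]> parse(Http.RequestHeader request, ByteString bytes) throws Exception {
    // Reject anything that does not declare a CSV content type.
    if (!request.contentType().map("text/csv"::equalsIgnoreCase).orElse(false)) {
        throw new IllegalArgumentException("expected text/csv content");
    }
    // Let OpenCSV do the actual work; it throws on malformed input.
    try (CSVReader reader = new CSVReader(new StringReader(bytes.utf8String()))) {
        return reader.readAll();
    }
}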

If the data is fine, the output of the parser is a domain request object. To get the object from a request, you need to call code like this in your controller class:

final List list = request().body().as(List.class);

May I accept JSON/XML only?

The idea behind the next parser comes from the fact that there is no built-in way to define a specific set of mime types a given endpoint supports.

Ok, you are right – I could use the Default parser. However, this one iterates over the predefined parsers only (it checks the Content-Type header against the values supported by the framework).

So, what would I like to achieve? The possibility to set the specific mime types a given endpoint accepts – let’s say just two: JSON and XML.

How can we solve such a situation? By implementing a custom parser composed of two other ones.

For the Todo application, I created the JsonOrXmlBodyParser parser that supports the two data formats mentioned above. It uses the predefined parsers TolerantJson and TolerantXml, handling the JSON and XML formats respectively. Both are injected in the constructor.

The result of parsing is of the play.libs.F.Either type, a functional programming helper class. It is a container holding either correctly parsed data (as Either.Right) or a Result instance with error information (as Either.Left).
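A sketch of how such a composed parser could delegate may look as follows (the jsonParser, xmlParser and executor fields, as well as the fromJson/fromXml mapping helpers, are illustrative names, not necessarily the actual code):

@Override
public Accumulator<ByteString, F.Either<Result, TodoItemRequest>> apply(Http.RequestHeader request) {
    String contentType = request.contentType().orElse("");
    if ("application/json".equalsIgnoreCase(contentType)) {
        // Delegate to the injected TolerantJson parser and map the JSON tree
        // to the domain request object on success.
        return jsonParser.apply(request).map(either -> either.left.isPresent()
                ? F.Either.<Result, TodoItemRequest>Left(either.left.get())
                : F.Either.<Result, TodoItemRequest>Right(fromJson(either.right.get())), executor);
    }
    if ("application/xml".equalsIgnoreCase(contentType)) {
        // Same idea for XML, using the injected TolerantXml parser.
        return xmlParser.apply(request).map(either -> either.left.isPresent()
                ? F.Either.<Result, TodoItemRequest>Left(either.left.get())
                : F.Either.<Result, TodoItemRequest>Right(fromXml(either.right.get())), executor);
    }
    // Any other mime type gets rejected up front with a 415 response.
    return Accumulator.done(F.Either.Left(Results.status(Http.Status.UNSUPPORTED_MEDIA_TYPE)));
}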

The parser’s output must be a concrete type. Here, it is a domain request instance holding the data necessary to create a new todo item, built from the JSON or XML structure. Thanks to this, I do not have to bother with these formats in a controller class when working on a request (creating a response is another story):

TodoItemRequest req = request().body().as(TodoItemRequest.class);

Protobuf protocol on the horizon

Writing a parser for CSV, XML or JSON is ok. However, it is, well, somewhat boring. I started looking for another format that would be less common and thus more interesting.

I decided on Google’s Protocol Buffers. According to Google Trends, it is not as popular as the formats mentioned above – one could say it is more exotic.

Protobuf is a binary format. It handles data more efficiently (in terms of body size) than JSON or XML. Thanks to the sbt-protobuf plugin, I could effortlessly use the protocol in my project. After providing a path to the files containing Protobuf messages, the plugin generates sources from them during the project’s build. With the generated code, I could start my play with Protobuf.

Naive approach

I wrote a parser for the action creating a single todo item. Since the protocol provides a serialization mechanism, parsing the received data is straightforward. Most of the logic deals with mime-type checks and error handling. Transforming the request’s data is as easy as the following two lines:

CreateItemRequest req = ProtobufTodoItem.CreateItemRequest.parseFrom(bytes.toArray());
TodoItemRequest domainReq = new TodoItemRequest(req.getName(), asLocalDateTime(req.getDueDate()));

If you know the fundamental concepts of data parsing in the Play Framework, you can quickly add custom logic. Following the same approach, I could implement parsers for the remaining actions handling request data.

However, TodoItemRequestProtobufParser is not the best solution, I think. While I get to use the efficient protocol, with this approach I also need to create a specialized parser for every action. Maintaining a parser for every endpoint is not the coolest thing. We can use Protobuf more efficiently.

Generic version

The Play Framework offers a mechanism of actions. What’s more, you can compose them into more sophisticated ones, and you can mark the composition with annotations.

What I have done is create a custom ProtobufParser annotation accepting the expected type of a Protobuf request. It composes the annotated method with the logic provided by the ProtobufAction class, whose responsibility is parsing the received request into the type provided in the annotation.
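A sketch of how these two pieces could be wired together with Play’s @With-based action composition (a simplified take assuming the Http.Context API; the reflective parseFrom lookup and the ctx.args hand-off illustrate one possible wiring for a getRequest helper, not necessarily the exact implementation):

import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionStage;

import com.google.protobuf.Message;
import play.mvc.Action;
import play.mvc.Http;
import play.mvc.Result;
import play.mvc.Results;
import play.mvc.With;

// The marker annotation carries the expected Protobuf message type.
@With(ProtobufAction.class)
@Target({ElementType.TYPE, ElementType.METHOD})
@Retention(RetentionPolicy.RUNTIME)
@interface ProtobufParser {
    Class<? extends Message> value();
}

class ProtobufAction extends Action<ProtobufParser> {
    @Override
    public CompletionStage<Result> call(Http.Context ctx) {
        try {
            // Every generated Protobuf message exposes a static parseFrom(byte[]);
            // this reflective lookup is the "nasty Java Reflection" mentioned later.
            Message parsed = (Message) configuration.value()
                    .getMethod("parseFrom", byte[].class)
                    .invoke(null, (Object) ctx.request().body().asBytes().toArray());
            ctx.args.put("protobufRequest", parsed); // later retrieved by getRequest(...)
            return delegate.call(ctx);
        } catch (ReflectiveOperationException e) {
            return CompletableFuture.completedFuture(Results.badRequest("cannot parse Protobuf body"));
        }
    }
}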

Here is how you can use it in a controller:

@ProtobufParser(ProtobufTodoItem.FetchItemsRequest.class)
public Result findItems() {
    ProtobufTodoItem.FetchItemsRequest request = getRequest(ProtobufTodoItem.FetchItemsRequest.class);
    ...
}

This solution is generic. You can annotate a controller method, specifying what kind of Protobuf request you expect there, and parse the request’s data – without touching the standard parsing mechanism!

Is the solution perfect? Not at all. I think it is not even complete: I have not used an extension registry, and you can find nasty Java reflection usage in such a small class. But it works 😉

Summary

So, what are my thoughts about request data parsing in the Play Framework? I have two main points.

First of all, I have found no way to define a global custom parser for the whole application. With this feature, the code would look cleaner, with fewer annotations and imports required. Compared to the capabilities of the Spring Framework, this is something I miss badly.

The second thing is parsing data as a stream. It can be handy for early detection of a Content-Type error, immediate signaling of an exception during data processing, etc. Moreover, we can handle massive amounts of data with no need to load everything into memory at once. In combination with Alpakka’s connectors, we get a powerful tool.

Overall, the parsing framework is lightweight. The standard implementations provide samples of how to handle data, and we have to do nothing if an application uses only the basic stuff. On the other hand, creating a custom parser is straightforward, and we have much freedom in implementing it. If not for the missing possibility of registering a specific parser globally, this part of the Framework would be really good.
