‘Die Young’: UE4 Open World creation and optimization


Die Young production started with an interesting idea from an artistic point of view. It was a crazy challenge for us: we had to develop an open world set on a Mediterranean island using an engine that was seriously lacking in advanced open-world optimization techniques, and there were just six of us, only four with previous experience in game development.

The first six months were really hard: the engine got a lot of updates in that period but still wasn’t powerful enough to run our game.

So we decided to spend the first year studying the engine and working on a prototype, which we later presented at Gamescom in 2015.

 

What’s the problem dude? It’s Unreal Engine 4!

Yeah, and it’s fantastic! But it’s an undeniable fact that it was lacking a lot of rendering optimizations, and that was the biggest problem we had, since we didn’t have a graphics programmer or an engine programmer and couldn’t reach our target FPS with the base engine provided by Epic Games (our goal was 60 fps on a GTX 970 at max settings, rendering a realistic world).

Many rendering optimization updates arrived during 2015; alongside them Epic Games released the Kite Demo, which ran at 30 FPS on my GTX 970: half of our goal.

We continued developing the prototype: we had a landscape with some foliage on it, and the only reason it could run at 50-60 fps on a GTX 970 was that Tommaso Magherini, our Level Designer, had smartly built a level made of “corridors” that occluded almost all of the rest of the map. But production didn’t want this: the goal was a game environment made of wide views and fields full of vegetation.

 

After months of prototyping, studying, updates and optimizations we started building the game environment (January 2016).

 

Defining the costs and the workflow pipeline

With the little data I had on the game design (which wasn’t finished and would be modified many times during production), and with rendering times that changed from one UE version to the next, I estimated a data budget for art production that could match both the production requests and our capabilities.

The first things I started working on were an art workflow, far rendering distances (I couldn’t use heavy fog, since the setting was a Mediterranean island in a hot summer), foliage painting rules and a shading system that was cheap but solid at the same time, trying to keep video memory as low as possible and fps as high as possible.

 

Art workflow


Since we were three artists working on such a big project (a number later reduced to two, plus a Character Artist), I chose without hesitation to use Substance for our texturing pipeline. I set up a library to share the materials and textures I made in Designer with my teammates, so they could use them in Painter.

After defining content rules for artists in Unreal Engine, I also created a library of generic tiling textures in Unreal Engine 4 to quickly texture non-hero props using the materials I set up later.

The generic master material I set up had some common but useful functions for artists, beyond simply plugging Painter-exported textures into a basic material.

When a prop was needed quickly with just tiling textures, an artist could create a new instance of the master I created, choose a texture from the library (or the Painter-exported ones) and take advantage of its parameters and functions: adding mesh normals/ambient occlusion baked from a high poly, adding a Z-projected texture (moss or dust, for example), using world-aligned textures, using detail texturing on a top texturing layer, adding a fresnel and so on.

I created several variants of this master for different types of materials, and we ended up with different ones covering several texturing techniques:

  • Base_BCR: non-metallic materials. 1 x 32-bit RGBA texture for Base Color (RGB) and Roughness (A), 1 x 24-bit RGB texture for the normal map.
  • Base_BCRM: metallic materials. 1 x 32-bit RGBA texture for Base Color and Ambient Occlusion, 1 x 32-bit RGBA texture for roughness (R), metallic (B) and an optional mask in the G channel, 1 x 24-bit RGB texture for the normal map.
  • Base_Masked: non-metallic materials that use RGB masks and tiling textures. 1 x 32-bit RGBA texture for the masks and a maximum of 5 textures from the library for the texturing (1 for each mask + 1 base texture).

An example of a static mesh textured following these shader rules: a huge bridge made with tiling textures and masks.

04.PNG

(Made by Claudio Rapuano, Foliage Artist at Indiegala)

  • Base_Cloth: like BCR but with a different shading setup to work as a cloth material.
  • Base_SSS: like BCR but with a different shading setup to work as an SSS material, plus occasionally a texture for SSS masking.

And I can’t forget the Master_Foliage I built to handle all the foliage shading.
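
As a rough worked example of what these texture layouts cost, the sketch below sums the uncompressed size of the textures each master expects. The 2048 x 2048 resolution is an assumption made only for illustration, and the figures ignore mip maps and the block compression the engine applies on import, so real memory use will differ.

```python
# Bits per pixel of each texture a master material expects (taken from the list above).
MASTER_TEXTURE_BITS = {
    "Base_BCR": [32, 24],        # base color + roughness, normal map
    "Base_BCRM": [32, 32, 24],   # base color + AO, roughness/metallic/mask, normal map
}


def uncompressed_mib(bits_per_texture, size=2048):
    """Uncompressed size in MiB of square textures, without mips or compression.

    `size` is a hypothetical texture resolution, not a value from the article.
    """
    total_bytes = sum(size * size * bits // 8 for bits in bits_per_texture)
    return total_bytes / (1024 * 1024)


if __name__ == "__main__":
    for name, bits in MASTER_TEXTURE_BITS.items():
        print(name, uncompressed_mib(bits), "MiB")
```

Seen this way, the extra packed texture of Base_BCRM is the main difference in cost between the two masters.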

 

After noticing that we were rendering tons of tris in the scene, my colleague Claudio and I started generating LODs for the heavier foliage and architectural assets using Simplygon, which saved a lot of production time.

(Generating LODs for the SpeedTree trees was a bit tricky, since Simplygon LODs didn’t always work properly on them and generating them in Maya meant losing the SpeedTree wind information, which was really important for us. In the end we chose the SpeedTree LODs plus an imported billboard (BB), which a shader rotated based on the character location.)
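
The billboard shader itself is not shown here, but the math behind turning a billboard toward the character can be sketched as follows. This is only an illustration: it assumes the billboard rotates around the vertical axis and that its front faces +X at a yaw of 0, and the function name is hypothetical.

```python
import math


def billboard_yaw_degrees(billboard_xy, character_xy):
    """Yaw (degrees, around the vertical axis) that points the billboard's
    front toward the character. Assumes the front faces +X at yaw 0."""
    dx = character_xy[0] - billboard_xy[0]
    dy = character_xy[1] - billboard_xy[1]
    return math.degrees(math.atan2(dy, dx))


if __name__ == "__main__":
    # Character walking around a tree billboard placed at the origin.
    for position in [(100.0, 0.0), (0.0, 100.0), (-100.0, 0.0)]:
        print(position, billboard_yaw_degrees((0.0, 0.0), position))
```

In a material the same rotation would normally be applied per vertex, but the angle is computed the same way.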

World composition

Along with the Lead Programmer Matteo Battolla, I created the base of a persistent world with streamed sub-levels, deciding how to split every asset related to design and art and leaving the logic assets to my colleague.

Everything had to be loaded smartly around the character, without wasting video memory or CPU time, so we tuned the streaming distance of every sub-level to load it only where needed.

We divided each type of game element into different sub-levels. Some examples:

  • Landscape: the different tiles of the landscape.
  • Gameplay levels: they contained part of the game logic blueprints, the assets whose position had to be saved during level unloading, and movable gameplay meshes. An example of a couple of levels in world composition: 01.PNG
  • Statically lit scenes: I used these levels to create the environments that needed static lighting, so I could handle their level streaming and keep lightmaps out of video memory when I didn’t need them. At the same time I could work in parallel with the design guys. Sometimes I split a static lighting level into multiple levels if the map size exceeded 150 MB, to avoid CPU peaks on loading. An example: 02

And so on…
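
The 150 MB rule for static lighting levels can be illustrated with a small sketch. The split was a manual decision in production; the code below only shows one possible way to reason about it, using a first-fit-decreasing grouping. The area names and sizes are hypothetical.

```python
def split_by_budget(area_sizes_mb, budget_mb=150.0):
    """Group lit areas into sub-levels so that no sub-level exceeds the budget.

    First-fit decreasing: largest areas first, each placed in the first
    sub-level that still has room. Illustrative only, not the production tool.
    """
    levels = []  # each entry: {"areas": [names], "total_mb": float}
    ordered = sorted(area_sizes_mb.items(), key=lambda kv: (-kv[1], kv[0]))
    for name, size in ordered:
        if size > budget_mb:
            raise ValueError(f"{name} alone exceeds the {budget_mb} MB budget")
        for level in levels:
            if level["total_mb"] + size <= budget_mb:
                level["areas"].append(name)
                level["total_mb"] += size
                break
        else:
            # No existing sub-level has room: open a new one.
            levels.append({"areas": [name], "total_mb": size})
    return levels


if __name__ == "__main__":
    # Hypothetical lightmap sizes for the areas of one static lighting level.
    sizes = {"hall": 90.0, "cellar": 70.0, "stairs": 50.0, "crypt": 40.0}
    for index, level in enumerate(split_by_budget(sizes)):
        print(index, level["areas"], level["total_mb"])
```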

 

 

 

Rendering distance and optimization: how to survive the monster

 

Even though we had defined a lot of rules for art and design, the game was changing continuously, with new design elements and even new levels added to shipped areas, so we always kept an eye on the game’s GPU and CPU usage, trying to reduce it every time it got worse and to understand what was causing the regression.

To reach a high average fps, I used the first player view as the reference for getting the highest fps at large rendering distances. From here you can see the Coastal tower level on the left, the final tower on the right and the manor farm (now behind the trees in the center-right), and a lot of pixels are spent rendering foliage, static meshes, sea (turning left and right), sky and landscape.

20170623220409_1

Landscape LODs were our first boost in terms of frame rate.

I also managed to keep the trees we placed with the foliage tool in exactly the same world position they had in the non-LOD tile: this feature is not provided by Epic Games, so I created it for our game.

Every time the designers completed (with the Level Artist) a tile of the landscape, I was ready to generate a LOD for it. Unfortunately, the first thing I discovered was that the foliage wasn’t present in the LOD, and I couldn’t have trees popping in on the landscape and ruining everything. So after some days of trial and error (because basically I had no idea how to copy the foliage from my level in the Main World to the LOD level, and neither did the guys from Epic) I magically found a solution, which I can’t share with you for the moment but will in the future if this feature isn’t added to the engine.

Since most of the levels were towers or big houses and exploration was one of the core design elements, I had to find a way to render them at any time, even from 2 km away if needed, obviously without loading the level itself with all its gameplay logic and assets.

 

I used HLOD to render just the exterior of a level at far distances, but beyond a certain distance I had to unload the level, which meant losing the HLOD from the scene.

So I designed a three-layer rendering of the levels (let’s take a “tower” level as an example):

  1. 100-150 meters from the level bounds: the tower level is loaded + HLOD. Internal meshes are culled.
  2. 300-400 meters from the level bounds: the tower level is unloaded and an HLOD mesh is placed on the tile level, with a min draw distance close to the tower level streaming distance, so that it pops in just a moment before the level unloads.
  3. 500 meters from the level bounds: the landscape tile level switches to the landscape tile LOD level, so the HLOD mesh placed on the tile is no longer visible. Placing the HLOD mesh on the LOD level as well solved this, allowing me to render it from very large distances.
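
The three layers above can be sketched as a simple distance check. This is a minimal illustration and not engine code: the threshold values and part names are hypothetical, picked from within the ranges listed above, and the proxy's min draw distance is set slightly below the streaming distance so the two overlap briefly.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TowerDistances:
    # All values are hypothetical, chosen from the ranges in the list above.
    level_streaming_m: float = 350.0   # tower level is unloaded beyond this
    proxy_min_draw_m: float = 340.0    # HLOD proxy appears just before the unload
    tile_lod_switch_m: float = 500.0   # landscape tile swaps to its LOD level


def rendered_parts(distance_m, d=TowerDistances()):
    """Return what is drawn for a tower seen from `distance_m` meters away."""
    parts = []
    if distance_m <= d.level_streaming_m:
        parts.append("tower_level")
    far_tile = distance_m >= d.tile_lod_switch_m
    if distance_m >= d.proxy_min_draw_m:
        # The proxy exists on both the tile and its LOD level, so it survives the swap.
        parts.append("hlod_proxy_on_lod_tile" if far_tile else "hlod_proxy_on_tile")
    parts.append("landscape_lod_tile" if far_tile else "landscape_tile")
    return parts


if __name__ == "__main__":
    for distance in (100.0, 345.0, 450.0, 2000.0):
        print(distance, rendered_parts(distance))
```

The short overlap between 340 and 350 meters is what hides the swap between the streamed level and its proxy.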

 

The next thing to analyze was the foliage: since the artists respected the rules and data budgets I gave for its generation, foliage wasn’t a problem and didn’t require a lot of optimization during development. We only had to tweak transparency, when it was hurting overdraw, and draw distance.

 

Further on we noticed that every layer painted on the landscape had a substantial cost, so I defined a maximum number of layers per landscape cluster, which could be analyzed through the “Layer Density” and “Layer Usage” visualization modes.

 

The issue with static meshes was their quantity and rendering distance: every landscape tile level had 1000-2000 static meshes on it, and we couldn’t handle them one by one from the outliner to decide where and how to render them, as that would have taken a lot of production time. So, along with the Lead Programmer, I built two different blueprints to handle them:

  1. CullDistance Assigner: a blueprint that assigns a draw distance to every asset in the level, based on datatables that have every static mesh of the game as a key and, as a value, a draw distance float derived from its bound diameter.
  2. CullDistance Checker: meant for hybrid manual use. It returns the list of static mesh actors in the scene with a draw distance of 0 (unlimited): clicking in the inspector next to a name takes the user to the mesh position, where a draw distance can be set.
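
The two blueprints are not reproduced here, but their logic can be sketched in plain Python. Everything below is an assumption made for illustration: the linear scaling from bound diameter to draw distance, the clamp values, and the dictionary layout standing in for the datatable and the level actors.

```python
def draw_distance_for(diameter_cm, factor=40.0, min_cm=3000.0, max_cm=200000.0):
    """Hypothetical rule: draw distance grows with the bound diameter, clamped."""
    return max(min_cm, min(max_cm, diameter_cm * factor))


def build_table(mesh_diameters_cm):
    """Stand-in for the datatable: static mesh name -> draw distance."""
    return {name: draw_distance_for(d) for name, d in mesh_diameters_cm.items()}


def assign_cull_distances(actors, table):
    """Assigner: set the draw distance of every actor whose mesh is in the table."""
    for actor in actors:
        if actor["mesh"] in table:
            actor["max_draw_distance"] = table[actor["mesh"]]
    return actors


def unlimited_actors(actors):
    """Checker: names of the actors still left at 0 (unlimited draw distance)."""
    return [a["name"] for a in actors if a.get("max_draw_distance", 0.0) == 0.0]


if __name__ == "__main__":
    table = build_table({"SM_Rock": 100.0, "SM_Tower": 10000.0})
    actors = [
        {"name": "Rock_1", "mesh": "SM_Rock", "max_draw_distance": 0.0},
        {"name": "Crate_1", "mesh": "SM_Crate", "max_draw_distance": 0.0},
    ]
    assign_cull_distances(actors, table)
    print(unlimited_actors(actors))  # meshes missing from the table need manual review
```

Keeping the checker separate means meshes that are missing from the table get flagged for manual review instead of being silently left unlimited.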

 

Then I moved on to texture streaming: the ListStreamingTextures command returned an unordered and confusing list, so I asked my colleague Vladimir, a programmer, to edit the engine and make the command output a CSV. Then I wrote a Python script to sort this list by texture filter and size, and finally I could analyze all the textures used during the game in a better way, avoiding waste and resizing where needed.

Another tool I often used during development, and one that was very useful for analyzing shipped builds, was Intel GPA and its Frame Analyzer, which allowed me to detect the heavier elements and profile them more deeply than I could in UE4.
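
The original script is not included in this article; a minimal sketch of the same idea could look like the following. The column names (Name, Group, Width, Height) are assumptions, since the CSV came from a custom engine change and its real layout is not documented here.

```python
import csv
import io

# Hypothetical sample of the CSV produced by the modified command.
SAMPLE = """Name,Group,Width,Height
T_Rock_BC,World,2048,2048
T_Hero_BC,Character,4096,4096
T_Moss_BC,World,512,512
T_Wall_N,World,4096,2048
"""


def sort_streaming_textures(csv_text, group_filter=None):
    """Return rows sorted by group, then by pixel count (largest first)."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    if group_filter is not None:
        rows = [r for r in rows if r["Group"] == group_filter]

    def sort_key(row):
        pixels = int(row["Width"]) * int(row["Height"])
        return (row["Group"], -pixels, row["Name"])

    return sorted(rows, key=sort_key)


if __name__ == "__main__":
    for row in sort_streaming_textures(SAMPLE, group_filter="World"):
        print(row["Name"], row["Width"], "x", row["Height"])
```

With the largest textures of each group listed first, oversized ones are easy to spot and resize.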

 

Lighting

The last topic of this article is about the lighting technique I used for “Die Young”.

The game environment had to be illuminated and colored in two different ways:

  • a dynamic lighting cycle from morning to sunset, with an intense and saturated color palette to bring out the Mediterranean colors in the exterior;
  • underground places with a dark and creepy color palette to scare the player, with horror elements: a completely different feeling.


The first solution I chose for the exterior was completely dynamic, with distance field shadowing. We discarded distance field AO (DFAO) because it was too costly in terms of rendering and production time.

 

For the interiors, I initially tried some dynamic lights, but it was a disaster in terms of quality/performance ratio, especially with a dynamic skylight that illuminated everything (since we didn’t use DFAO).

The problem itself wasn’t the creation of an illuminated environment, but how to remove the skylight contribution in those areas, and after some tests (postprocessing, a blueprint to scale skylight intensity, DFAO again…) I decided to use a stationary skylight and bake a shadow map in those areas. This was the cheap solution that kept the game running smoothly and with nice graphics.

Once I got a good shadow map, I illuminated each area with movable and static lights, keeping map sizes below 150MB to avoid loading peaks.

 

This solution also implied adding some rules for the designers and artists, like avoiding landscape in those areas (really bad performance with movable lights, which I later solved using lighting channels, and bad quality with static lighting) and using modules as big as possible.

Where they couldn’t use bigger modules, I merged everything possible to avoid lightmaps for hundreds of modules and save hours of build time.

 

Since we separated gameplay logic from level design, a lighting rebuild is only necessary if the designers have to move some modules contained in the static lighting (SL) levels.

In this way designers can edit the levels using the movable gameplay elements (movable crates, AI, falling platforms, traps...), which are lit by the indirect lighting cache I built along with the level.

 

To reach a good ratio between quality and performance, I set rules for the lightmap size of every asset in the game, and avoided using shadows on static lights, since the resolution wouldn’t be enough to get a smooth shadow baked into a lightmap (for shadows in the interiors I usually used movable lights).

As I said, each sub-level is below 150 MB to avoid loading peaks, and the frame rate in these areas is totally fine (70+ fps with a GTX 970 on epic settings).

This solution was then applied to each sub-level with static lighting in the game.

 

The final result

 

After two years of hard work we finally got a good product with a nice compromise between graphics and performance.

We squeezed the engine to optimize everything as much as we could, and more optimizations are on the roadmap, ready to be introduced.

The game is an action parkour, and to keep it over 60 fps we discarded a lot of cool graphics features that UE4 offers to make your game’s graphics really cool and unique, but you know, nothing is free.

In the end I can say that it was an amazing experience and I have to compliment all my teammates. You rock!

Die Young is available on Steam

http://store.steampowered.com/app/433170/Die_Young/
