Thanks Michael,
yes, unpacking the nested fields into individual arrays within the inventory collection would have been my preferred choice but this cannot be achieved directly. Together with Alexander Friedrich from Software AG we worked out these 3 options:
Option 1: Known number of array entries
Just as you stated - if the (max) number of array elements is known we can simply export specific indexes into named columns e.g. src.additional[0].type
→ additional_type_0
and so on.
This is definitely the easiest way to do this.
Option 2: Flatten entire array
It is possible to “flatten” the array during offloading which will produce multiple rows in the offloaded table. The target column type will be of type STRUCT
. For example: If FLATTEN(src.additional)
is offloaded to an additional
column, I can later access individual values within this column like src.additional.type
.
This has two disadvantages: A) multiple rows are being created which creates a bit of redundancy and B) all fields from the array are being offloaded. It is not possible to just select specific nested fields and they cannot be offloaded to individual columns either.
Option 3: Export the array as ARRAY/LIST and deal with it later
If the array fragment is offloaded as-is it will be a single column of type LIST
. This list and the elements within (of type STRUCT
) can be access it queries, though. We can use this to create a Virtual Data Set (VDS) which holds just specific nested fields of the array in a nice, normalized way.
In this (phony) example I would create an inventory_additional
VDS which is based on the offloaded inventory
collection like so:
SELECT o.id, o.lastUpdated, o.add.type, o.add.number
FROM (
SELECT src.id, src.lastUpdated, FLATTEN(src.additional) as add
FROM inventory as src
) as o
Because the array/list is _flatten_ed, this will result in multiple rows in the inventory_additional
view. This view can then be used in all kinds of queries.
I chose option 3 in the end, as in my actual data the array has many more nested fields (and I only needed a couple) and I could not assume a max number of array elements. The flattened VDS is pretty nice, getting the whole thing closer to a normalized database design.
Cheers! Christoph