Another question: I need to store airport info in a parquet file and
present it when a user makes a query.

For example:

"airport": {
    "code": "TPE",
    "name": "Taipei (Taoyuan Intl.)",
    "longName": "Taipei, Taiwan (TPE-Taoyuan Intl.)",
    "city": "Taipei",
    "localName": "Taoyuan Intl.",
    "airportCityState": "Taipei, Taiwan"
}


Is it best practice to store just the code "TPE" and then look up the name
"Taipei (Taoyuan Intl.)" from a relational database? Any alternatives?
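To make the lookup idea concrete, here is a minimal sketch of what I have in
mind. It assumes the parquet rows carry only the airport code and that the
descriptive fields live in a small relational table; an in-memory SQLite
database stands in for the real one. The table layout, the flight ids, and the
function names are hypothetical, made up only for illustration.

```python
import sqlite3

# Hypothetical lookup table: one row per airport, keyed by its code.
# The values are taken from the example record above.
AIRPORTS = [
    ("TPE", "Taipei (Taoyuan Intl.)", "Taipei", "Taoyuan Intl.", "Taipei, Taiwan"),
]

# Hypothetical rows as they might come back from the parquet file:
# only the airport code is stored, none of the descriptive fields.
FLIGHT_ROWS = [
    ("flight-1", "TPE"),
    ("flight-2", "TPE"),
]


def build_db():
    """Create an in-memory SQLite database holding the airport lookup table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE airport ("
        "code TEXT PRIMARY KEY, name TEXT, city TEXT, "
        "local_name TEXT, airport_city_state TEXT)"
    )
    conn.executemany("INSERT INTO airport VALUES (?, ?, ?, ?, ?)", AIRPORTS)
    return conn


def enrich(conn, rows):
    """Attach the display name to each (flight_id, code) row.

    Codes that are missing from the lookup table get None as the name.
    """
    out = []
    for flight_id, code in rows:
        hit = conn.execute(
            "SELECT name FROM airport WHERE code = ?", (code,)
        ).fetchone()
        out.append((flight_id, code, hit[0] if hit else None))
    return out


conn = build_db()
enriched = enrich(conn, FLIGHT_ROWS)
for row in enriched:
    print(row)
```

The trade-off as I understand it: the parquet file stays small and a renamed
airport needs only one update, but every query pays for the extra lookup.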

On Sun, Apr 30, 2017 at 6:34 PM, Jörn Franke <jornfra...@gmail.com> wrote:

> It depends on your queries, the data structure, etc. Generally flat is
> better, but if your query filter is on the highest level then you may get
> better performance with a nested structure. It really depends.
>
> > On 30. Apr 2017, at 10:19, Zeming Yu <zemin...@gmail.com> wrote:
> >
> > Hi,
> >
> > We're building a parquet based data lake. I was under the impression
> > that flat files are more efficient than deeply nested files (say 3 or 4
> > levels down). Is that correct?
> >
> > Thanks,
> > Zeming
>
