Another question: I need to store airport info in a parquet file and present it when a user makes a query.
For example:

    "airport": {
        "code": "TPE",
        "name": "Taipei (Taoyuan Intl.)",
        "longName": "Taipei, Taiwan (TPE-Taoyuan Intl.)",
        "city": "Taipei",
        "localName": "Taoyuan Intl.",
        "airportCityState": "Taipei, Taiwan"
    }

Is it best practice to store just the code "TPE" and then look up the name "Taipei (Taoyuan Intl.)" from a relational database? Any alternatives?

On Sun, Apr 30, 2017 at 6:34 PM, Jörn Franke <jornfra...@gmail.com> wrote:

> Depends on your queries, the data structure etc. generally flat is better,
> but if your query filter is on the highest level then you may have better
> performance with a nested structure, but it really depends
>
> > On 30. Apr 2017, at 10:19, Zeming Yu <zemin...@gmail.com> wrote:
> >
> > Hi,
> >
> > We're building a parquet based data lake. I was under the impression
> > that flat files are more efficient than deeply nested files (say 3 or 4
> > levels down). Is that correct?
> >
> > Thanks,
> > Zeming
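
P.S. To make the two options in my airport question concrete, here is a rough sketch of what I have in mind. It uses Python's built-in sqlite3 module as a stand-in for the relational database and a plain list of dicts as a stand-in for rows read from the parquet file. The table name `airport`, the column `flight_id`, and the helper `enrich` are made up for illustration and are not from any real schema.

```python
import sqlite3

# Hypothetical dimension table: each airport's attributes are stored once.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE airport (code TEXT PRIMARY KEY, name TEXT, city TEXT)"
)
conn.execute(
    "INSERT INTO airport VALUES (?, ?, ?)",
    ("TPE", "Taipei (Taoyuan Intl.)", "Taipei"),
)

# Stand-in for rows read from the parquet file: only the code is stored.
# (flight_id is an invented column for illustration.)
fact_rows = [{"flight_id": 1, "airport_code": "TPE"}]


def enrich(rows, conn):
    """Option 1: look up the airport attributes at query time."""
    out = []
    for row in rows:
        hit = conn.execute(
            "SELECT name, city FROM airport WHERE code = ?",
            (row["airport_code"],),
        ).fetchone()
        # Unknown codes yield None values instead of raising.
        name, city = hit if hit else (None, None)
        out.append({**row, "airport_name": name, "airport_city": city})
    return out


# Option 2: run the same lookup once at write time and store the
# flattened result in parquet, so that queries need no join.
flattened = enrich(fact_rows, conn)
print(flattened)
```

Option 1 keeps the parquet file small and makes a renamed airport a one-row update, at the cost of a lookup per query. Option 2 repeats the name in every row but matches the "flat is better" advice quoted above.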