Hi Frank, thanks a lot for your response, this is very helpful!
I was actually trying to figure out whether the current Spark version supports repetition levels (https://blog.twitter.com/2013/dremel-made-simple-with-parquet), and it now looks good to me. It is very hard to find good information about that. I have now found this as well: https://git-wip-us.apache.org/repos/asf?p=spark.git;a=blob;f=sql/core/src/main/scala/org/apache/spark/sql/parquet/ParquetTestData.scala;h=1dc58633a2a68cd910c1bab01c3d5ee1eb4f8709;hb=f479cf37

I wasn't sure about it because nested data can mean many different things!

It would be awesome if I could use SQL to find rows by firstRepeatedid or secoundRepeatedid. If it only works with a map/reduce-style job, that is fine too. The most important thing is to filter on the first or second repeated id as fast as possible, both individually and in combination. I am now going to start playing with this to get the best search results!

My schema looks like this:

    val nestedSchema =
      """message nestedRowSchema {
           int32 firstRepeatedid;
           repeated group level1 {
             int64 secoundRepeatedid;
             repeated group level2 {
               int64 value1;
               int32 value2;
             }
           }
         }
      """

Best,
Matthes

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Is-it-possible-to-use-Parquet-with-Dremel-encoding-tp15186p15239.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
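
To make the lookup described in this message concrete, here is a minimal sketch of filtering on firstRepeatedid, on secoundRepeatedid, and on both together. It uses plain Scala collections only and does not touch the Spark or Parquet APIs, so it says nothing about how Spark SQL would execute such a query. The case classes (NestedRow, Level1, Level2), the object NestedFilterSketch, and the sample rows are all hypothetical names and data chosen to mirror the schema above.

```scala
// Plain Scala collections only; no Spark or Parquet dependency.
// The case classes mirror the schema from the message; all names here are illustrative.
final case class Level2(value1: Long, value2: Int)
final case class Level1(secoundRepeatedid: Long, level2: Seq[Level2])
final case class NestedRow(firstRepeatedid: Int, level1: Seq[Level1])

object NestedFilterSketch {
  // Keep rows whose top-level id matches.
  def byFirstId(rows: Seq[NestedRow], id: Int): Seq[NestedRow] =
    rows.filter(_.firstRepeatedid == id)

  // Keep rows that contain at least one level1 group with the given id.
  def bySecondId(rows: Seq[NestedRow], id: Long): Seq[NestedRow] =
    rows.filter(_.level1.exists(_.secoundRepeatedid == id))

  // Combined filter: both ids must match; returns the level2 values underneath.
  def valuesFor(rows: Seq[NestedRow], firstId: Int, secondId: Long): Seq[Level2] =
    for {
      row <- rows if row.firstRepeatedid == firstId
      l1  <- row.level1 if l1.secoundRepeatedid == secondId
      l2  <- l1.level2
    } yield l2

  // Made-up sample data, not taken from any real data set.
  val sampleRows: Seq[NestedRow] = Seq(
    NestedRow(1, Seq(
      Level1(10L, Seq(Level2(100L, 1), Level2(101L, 2))),
      Level1(11L, Seq(Level2(110L, 3)))
    )),
    NestedRow(2, Seq(
      Level1(10L, Seq(Level2(200L, 4)))
    ))
  )

  def main(args: Array[String]): Unit = {
    println(byFirstId(sampleRows, 2))
    println(bySecondId(sampleRows, 10L).map(_.firstRepeatedid))
    println(valuesFor(sampleRows, 1, 10L))
  }
}
```

The combined filter walks the nesting level by level, which is the behaviour a SQL or map/reduce version would also need to reproduce. Whether it can be made fast depends on how the engine reads the nested columns, and that is not covered here.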