Re: Is it possible to use Parquet with Dremel encoding

matthes Fri, 26 Sep 2014 08:49:06 -0700

Hi Frank,

Thanks a lot for your response, this is very helpful!


Actually, I was trying to figure out whether the current Spark version supports
repetition levels
(https://blog.twitter.com/2013/dremel-made-simple-with-parquet), and now it
looks good to me.
It is very hard to find good information about that. I have now found this as
well:
https://git-wip-us.apache.org/repos/asf?p=spark.git;a=blob;f=sql/core/src/main/scala/org/apache/spark/sql/parquet/ParquetTestData.scala;h=1dc58633a2a68cd910c1bab01c3d5ee1eb4f8709;hb=f479cf37
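
For readers unfamiliar with the term: in the Dremel/Parquet encoding, the maximum repetition level of a column is the number of `repeated` fields on the path from the root to that column. The sketch below only illustrates that counting rule in plain Scala; it does not use the Parquet or Spark APIs. The names `PathElement`, `maxRepetitionLevel` and `columns` are hypothetical, and the column paths are taken from the schema shown later in this message.

```scala
object RepetitionLevelSketch {
  // One step in a column path: the field name and whether it is declared `repeated`.
  // (Hypothetical helper type, not part of Parquet or Spark.)
  final case class PathElement(name: String, repeated: Boolean)

  // Maximum repetition level = number of repeated fields along the path.
  def maxRepetitionLevel(path: Seq[PathElement]): Int =
    path.count(_.repeated)

  // Column paths of the schema in this message. Fields not marked `repeated`
  // in the schema are treated as non-repeated here.
  val columns: Seq[(String, Seq[PathElement])] = Seq(
    "firstRepeatedid" -> Seq(PathElement("firstRepeatedid", repeated = false)),
    "level1.secoundRepeatedid" -> Seq(
      PathElement("level1", repeated = true),
      PathElement("secoundRepeatedid", repeated = false)),
    "level1.level2.value1" -> Seq(
      PathElement("level1", repeated = true),
      PathElement("level2", repeated = true),
      PathElement("value1", repeated = false)),
    "level1.level2.value2" -> Seq(
      PathElement("level1", repeated = true),
      PathElement("level2", repeated = true),
      PathElement("value2", repeated = false))
  )

  // Column name paired with its maximum repetition level.
  val levels: Seq[(String, Int)] =
    columns.map { case (name, path) => name -> maxRepetitionLevel(path) }
}
```

Under this rule the top-level id has level 0, the id inside `level1` has level 1, and the two values inside `level2` have level 2.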

I wasn't sure about that because nested data can mean many different things!
If it works with SQL, being able to find the firstRepeatedid or secoundRepeatedid
would be awesome. But if it only works with some kind of map/reduce job, then that
is also good. The most important thing is to filter on the first or second repeated
value as fast as possible, and on both in combination as well.
I will now start playing with these things to get the best search results!

My schema looks like this:

val nestedSchema =
    """message nestedRowSchema
       {
         int32 firstRepeatedid;
         repeated group level1
         {
           int64 secoundRepeatedid;
           repeated group level2
           {
             int64 value1;
             int32 value2;
           }
         }
       }
    """
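
A minimal sketch of the kind of filtering described above, written against plain Scala collections in a map/filter style rather than against Spark or Parquet. The case classes mirror the schema in this message; `NestedFilterSketch`, `byFirstId`, `bySecondId`, `byBoth` and the `sample` data are hypothetical names and values introduced only for illustration, and nothing here shows how Spark itself would evaluate such a filter.

```scala
object NestedFilterSketch {
  // Case classes mirroring the nested schema (field names kept as in the schema).
  final case class Level2(value1: Long, value2: Int)
  final case class Level1(secoundRepeatedid: Long, level2: Seq[Level2])
  final case class NestedRow(firstRepeatedid: Int, level1: Seq[Level1])

  // Keep only the rows whose top-level id matches.
  def byFirstId(rows: Seq[NestedRow], id: Int): Seq[NestedRow] =
    rows.filter(_.firstRepeatedid == id)

  // Flatten to (firstRepeatedid, secoundRepeatedid, value1, value2),
  // keeping only level1 groups whose id matches.
  def bySecondId(rows: Seq[NestedRow], id: Long): Seq[(Int, Long, Long, Int)] =
    for {
      row <- rows
      l1  <- row.level1 if l1.secoundRepeatedid == id
      l2  <- l1.level2
    } yield (row.firstRepeatedid, l1.secoundRepeatedid, l2.value1, l2.value2)

  // Both filters in combination.
  def byBoth(rows: Seq[NestedRow], firstId: Int, secondId: Long): Seq[(Int, Long, Long, Int)] =
    bySecondId(byFirstId(rows, firstId), secondId)

  // Made-up sample data, for illustration only.
  val sample: Seq[NestedRow] = Seq(
    NestedRow(1, Seq(
      Level1(10L, Seq(Level2(100L, 1), Level2(101L, 2))),
      Level1(11L, Seq(Level2(110L, 3))))),
    NestedRow(2, Seq(
      Level1(10L, Seq(Level2(200L, 4)))))
  )
}
```

For example, `NestedFilterSketch.byBoth(NestedFilterSketch.sample, 2, 10L)` keeps only the values under the row with firstRepeatedid 2 and the `level1` group with secoundRepeatedid 10.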

Best,
Matthes



--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/Is-it-possible-to-use-Parquet-with-Dremel-encoding-tp15186p15239.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
