Re: Read parquet folders recursively

2015-03-12 Thread Akhil Das
With fileStream you are free to plug in any InputFormat; in your case you can easily plug in ParquetInputFormat. Here are some parquet-hadoop examples: https://github.com/Parquet/parquet-mr/tree/master/parquet-hadoop/src/main/java/parquet/hadoop/example

Thanks
Best Regards
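A rough sketch of what that plumbing could look like, using the pre-Apache parquet-mr ExampleInputFormat from the linked repo, which emits (Void, Group) pairs. The input path and batch interval are placeholders, and this is untested:

    import org.apache.spark.SparkContext
    import org.apache.spark.streaming.{Seconds, StreamingContext}
    import parquet.example.data.Group
    import parquet.hadoop.example.ExampleInputFormat

    // ExampleInputFormat extends ParquetInputFormat[Group], so it can be
    // plugged into fileStream directly as the F type parameter.
    val sc = new SparkContext("local[2]", "parquet-stream")
    val ssc = new StreamingContext(sc, Seconds(30))

    val records = ssc.fileStream[Void, Group, ExampleInputFormat]("/some/parquet/dir")
    records.map(_._2.toString).print()

    ssc.start()
    ssc.awaitTermination()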

Re: Read parquet folders recursively

2015-03-12 Thread Masf
Hi. Thanks for your answers, but isn't it necessary to use the parquetFile method on org.apache.spark.sql.SQLContext to read parquet files? How can I combine your solution with a call to this method? Thanks!!

Regards

Re: Read parquet folders recursively

2015-03-12 Thread Akhil Das
Hi

We have a custom build to read directories recursively. Currently we use it with fileStream like:

    val lines = ssc.fileStream[LongWritable, Text, TextInputFormat](
      "/datadumps/", (t: Path) => true, true, true)

Making the 4th argument true enables recursive reading. You could give it a try.

Re: Read parquet folders recursively

2015-03-12 Thread Yijie Shen
org.apache.spark.deploy.SparkHadoopUtil has a method:

    /**
     * Get [[FileStatus]] objects for all leaf children (files) under the given base path. If the
     * given path points to a file, return a single-element collection containing [[FileStatus]] of
     * that file.
     */
    def listLeafStatuses(fs: FileSystem, basePath: Path): Seq[FileStatus]
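Putting that together with SQLContext.parquetFile would answer the question above. A minimal sketch, assuming an existing SparkContext sc and a Spark 1.3-era SQLContext whose parquetFile accepts multiple paths; the base path is a placeholder and the .parquet suffix check is just one way to skip non-data files:

    import org.apache.hadoop.fs.{FileSystem, Path}
    import org.apache.spark.deploy.SparkHadoopUtil
    import org.apache.spark.sql.SQLContext

    // Recursively list every leaf file under the base path, keep the
    // parquet files, and pass them all to SQLContext.parquetFile.
    val fs = FileSystem.get(sc.hadoopConfiguration)
    val parquetPaths = SparkHadoopUtil.get
      .listLeafStatuses(fs, new Path("/some/base/dir"))
      .map(_.getPath.toString)
      .filter(_.endsWith(".parquet"))

    val sqlContext = new SQLContext(sc)
    val df = sqlContext.parquetFile(parquetPaths: _*)
    df.printSchema()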

Read parquet folders recursively

2015-03-11 Thread Masf
Hi all

Is it possible to read folders recursively in order to read parquet files? Thanks.

--
Regards.
Miguel Ángel