Re: HDFS file hdfs://127.0.0.1:9000/hdfs/spark/examples/README.txt

2020-04-06 Thread jane thorpe
Did you know that the simple demo program for reading characters from a file didn't work? Who wrote that simple hello-world-type little program? jane thorpe janethor...@aol.com -Original Message- From: jane thorpe To: somplasticllc ; user Sent: Fri, 3 Apr 2020 2:44 Subject: Re: HDF

Re: HDFS file hdfs://127.0.0.1:9000/hdfs/spark/examples/README.txt

2020-04-06 Thread Som Lima
> > jane thorpe > janethor...@aol.com > > > -Original Message- > From: jane thorpe > To: somplasticllc ; user > Sent: Fri, 3 Apr 2020 2:44 > Subject: Re: HDFS file hdfs:// > 127.0.0.1:9000/hdfs/spark/examples/README.txt > > > Thanks darling > >

Fwd: HDFS file hdfs://127.0.0.1:9000/hdfs/spark/examples/README.txt

2020-04-06 Thread jane thorpe
: HDFS file hdfs://127.0.0.1:9000/hdfs/spark/examples/README.txt Thanks darling. I tried this and it worked: hdfs getconf -confKey fs.defaultFS hdfs://localhost:9000 scala> :paste // Entering paste mode (ctrl-D to finish) val textFile = sc.textFile("hdfs://127.0.0.1:9000/hdfs/spark/

Re: HDFS file hdfs://127.0.0.1:9000/hdfs/spark/examples/README.txt

2020-04-02 Thread jane thorpe
0.1:9000/hdfs/spark/examples/README.txt MapPartitionsRDD[91] at textFile at :27 counts: org.apache.spark.rdd.RDD[(String, Int)] = ShuffledRDD[94] at reduceByKey at :30 scala> :quit jane thorpe janethor...@aol.com -Original Message- From: Som Lima CC: user Sent: Tue, 31 Mar 2020

Re: HDFS file

2020-03-31 Thread Som Lima
Hi Jane, try this example: https://github.com/apache/spark/blob/master/examples/src/main/scala/org/apache/spark/examples/streaming/HdfsWordCount.scala Som On Tue, 31 Mar 2020, 21:34 jane thorpe, wrote: > hi, > > Are there setup instructions on the website for >
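For reference, a minimal sketch along the lines of the linked HdfsWordCount example: a streaming word count over files appearing in an HDFS directory. The directory path and batch interval below are placeholders, not values from the thread.

import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

object HdfsWordCountSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("HdfsWordCountSketch")
    val ssc = new StreamingContext(conf, Seconds(10))          // placeholder batch interval
    // Watch an HDFS directory for newly created files (placeholder path)
    val lines = ssc.textFileStream("hdfs://localhost:9000/hdfs/spark/examples/")
    val counts = lines.flatMap(_.split(" ")).map((_, 1)).reduceByKey(_ + _)
    counts.print()
    ssc.start()
    ssc.awaitTermination()
  }
}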

HDFS file

2020-03-31 Thread jane thorpe
hi, Are there setup instructions on the website for spark-3.0.0-preview2-bin-hadoop2.7? I can run the same program for hdfs format: val textFile = sc.textFile("hdfs://...") val counts = textFile.flatMap(line => line.split(" ")) .map(word => (word, 1)) .reduceByKey(_ +

hdfs file partition

2018-04-19 Thread 崔苗
Hi, when I create a dataset by reading a json file from hdfs, I found that the partition number of the dataset does not equal the number of file blocks, so what defines the partition number of the dataset when I read a file from hdfs?
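A rough illustration of what was asked (paths are placeholders, and the exact behaviour depends on the Spark version): for file-based sources the partition count is driven by input split sizing, e.g. spark.sql.files.maxPartitionBytes in Spark 2.x, rather than mapping one-to-one onto HDFS blocks, and it can always be inspected and overridden:

val df = spark.read.json("hdfs://localhost:9000/data/input.json")   // placeholder path
println(df.rdd.getNumPartitions)   // partitions chosen by the file scan, not necessarily the block count
val df8 = df.repartition(8)        // force a specific partition count if needed
println(df8.rdd.getNumPartitions)  // 8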

Change the owner of hdfs file being saved

2017-11-02 Thread Sunita Arvind
Hello Experts, I am required to use a specific user id to save files on a remote hdfs cluster. Remote in the sense that spark jobs run on EMR and write to a CDH cluster. Hence I cannot change the hdfs-site.xml etc. to point to the destination cluster. As a result I am using webhdfs to save the files

how to give hdfs file path as argument to spark-submit

2017-02-17 Thread nancy henry
Hi All, object Step1 { def main(args: Array[String]) = { val sparkConf = new SparkConf().setAppName("my-app") val sc = new SparkContext(sparkConf) val hiveSqlContext: HiveContext = new org.apache.spark.sql.hive.HiveContext(sc)
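A hedged sketch of the usual pattern: take the HDFS path from the program arguments and supply it on the spark-submit command line. The object name and app name come from the post above; the jar name and path are placeholders.

import org.apache.spark.{SparkConf, SparkContext}

object Step1 {
  def main(args: Array[String]): Unit = {
    val inputPath = args(0)                       // e.g. hdfs://namenode:9000/data/in.txt
    val sc = new SparkContext(new SparkConf().setAppName("my-app"))
    val lines = sc.textFile(inputPath)            // read the HDFS file passed as the first argument
    println(lines.count())
    sc.stop()
  }
}

// spark-submit --class Step1 my-app.jar hdfs://namenode:9000/data/in.txt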

Re: RDD Partitions on HDFS file in Hive on Spark Query

2016-11-22 Thread yeshwanth kumar
Hi Ayan, thanks for the explanation, I am aware of compression codecs. How does the locality level get set? Is it done by Spark or YARN? Please let me know, Thanks, Yesh On Nov 22, 2016 5:13 PM, "ayan guha" wrote: Hi RACK_LOCAL = Task running on the same rack but not on

Re: RDD Partitions on HDFS file in Hive on Spark Query

2016-11-22 Thread ayan guha
Hi RACK_LOCAL = task running on the same rack, but not on the same node, as where the data is. NODE_LOCAL = task and data are co-located. Probably you were looking for this one? GZIP - the read goes through the GZIP codec, but because it is non-splittable you can have at most 1 task reading a gzip file. Now, the
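To make the splittability point concrete, a small sketch (placeholder path): a gzip-compressed text file comes in as a single partition, and repartitioning after the read is one way to spread the downstream work:

val raw = sc.textFile("hdfs://namenode:9000/data/big.csv.gz")   // gzip is not splittable: 1 partition
println(raw.getNumPartitions)                                   // 1
val spread = raw.repartition(16)                                // redistribute work for later stages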

Re: RDD Partitions on HDFS file in Hive on Spark Query

2016-11-22 Thread yeshwanth kumar
Hi Ayan, we have the default rack topology. -Yeshwanth Can you Imagine what I would do if I could do all I can - Art of War On Tue, Nov 22, 2016 at 6:37 AM, ayan guha wrote: > Because snappy is not splittable, a single task makes sense. > > Are you sure about the rack topology?

Re: RDD Partitions on HDFS file in Hive on Spark Query

2016-11-22 Thread ayan guha
Because snappy is not splittable, a single task makes sense. Are you sure about the rack topology? I.e., is 225 in a different rack than 227 or 228? What does your topology file say? On 22 Nov 2016 10:14, "yeshwanth kumar" wrote: > Thanks for your reply, > > i can definitely

Re: RDD Partitions on HDFS file in Hive on Spark Query

2016-11-21 Thread yeshwanth kumar
Thanks for your reply, I can definitely change the underlying compression format, but I am trying to understand the locality level: why did the executor run on a different node, where the blocks are not present, when the locality level is RACK_LOCAL? Can you shed some light on this? Thanks, Yesh

Re: RDD Partitions on HDFS file in Hive on Spark Query

2016-11-21 Thread Jörn Franke
Use ORC, Parquet or Avro as the format, because they support any compression type with parallel processing. Alternatively, split your file into several smaller ones. Another alternative would be bzip2 (but slower in general) or LZO (usually not included by default in many distributions). > On

Re: RDD Partitions on HDFS file in Hive on Spark Query

2016-11-21 Thread Aniket Bhatnagar
Try changing compression to bzip2 or lzo. For reference - http://comphadoop.weebly.com Thanks, Aniket On Mon, Nov 21, 2016, 10:18 PM yeshwanth kumar wrote: > Hi, > > we are running Hive on Spark, we have an external table over snappy > compressed csv file of size 917.4 M

RDD Partitions on HDFS file in Hive on Spark Query

2016-11-21 Thread yeshwanth kumar
Hi, we are running Hive on Spark; we have an external table over a snappy-compressed csv file of size 917.4 M. The HDFS block size is set to 256 MB. As per my understanding, if I run a query over that external table it should launch 4 tasks, one for each block, but I am seeing one executor and one

get hdfs file path in spark

2016-07-25 Thread Yang Cao
Hi, being new here, I hope to get assistance from you guys. I wonder whether there is some elegant way to get a directory under some path. For example, I have a path on hdfs like /a/b/c/d/e/f, and I am given /a/b/c; is there any straightforward way to get the path /a/b/c/d/e? I think I can do
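One way to do this, sketched with the Hadoop FileSystem API (the paths are the illustrative ones from the question; the code itself is not from the thread):

import org.apache.hadoop.fs.{FileSystem, Path}

val fs = FileSystem.get(sc.hadoopConfiguration)

// List the directories directly under a given prefix, e.g. /a/b/c -> /a/b/c/d
def childDirs(p: String): Array[Path] =
  fs.listStatus(new Path(p)).filter(_.isDirectory).map(_.getPath)

childDirs("/a/b/c").foreach(println)

// Or, starting from a known file such as /a/b/c/d/e/f, walk up to its parent /a/b/c/d/e
println(new Path("/a/b/c/d/e/f").getParent)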

Re: Need Streaming output to single HDFS File

2016-04-12 Thread Sachin Aggarwal
Hey, you can use repartition and set it to 1, as in this example: unionDStream.foreachRDD((rdd, time) => { val count = rdd.count() println("count" + count) if (count > 0) { print("rdd partition=" + rdd.partitions.length) val outputRDD =
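A hedged sketch of that suggestion, filled out so it stands alone, using the unionDStream from the excerpt (the output path is a placeholder): coalesce each micro-batch to one partition so every interval writes a single part file.

unionDStream.foreachRDD { (rdd, time) =>
  if (!rdd.isEmpty()) {
    rdd.repartition(1)   // one partition => one part file per batch
       .saveAsTextFile(s"hdfs://namenode:9000/streaming-out/batch-${time.milliseconds}")
  }
}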

Need Streaming output to single HDFS File

2016-04-12 Thread Priya Ch
Hi All, I am working with Kafka and Spark Streaming, and I want to write the streaming output to a single file. dstream.saveAsTextFiles() is creating files in different folders. Is there a way to write to a single folder? Or, if written to different folders, how do I merge them? Thanks, Padma

Using Spark to retrieve a HDFS file protected by Kerberos

2016-03-23 Thread Nkechi Achara
I am having issues setting up my spark environment to read from a kerberized HDFS file location. At the moment I have tried to do the following: def ugiDoAs[T](ugi: Option[UserGroupInformation])(code: => T) = ugi match { case None => code case Some(u) => u

Using Spark to retrieve a HDFS file protected by Kerberos

2016-03-22 Thread Nkechi Achara
I am having issues setting up my spark environment to read from a kerberized HDFS file location. At the moment I have tried to do the following: def ugiDoAs[T](ugi: Option[UserGroupInformation])(code: => T) = ugi match { case None => code case Some(u) => u
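A hedged completion of the ugiDoAs pattern shown in the post, with a keytab login added for illustration (the principal, keytab path, and HDFS path are placeholders):

import java.security.PrivilegedExceptionAction
import org.apache.hadoop.security.UserGroupInformation

def ugiDoAs[T](ugi: Option[UserGroupInformation])(code: => T): T = ugi match {
  case None    => code
  case Some(u) => u.doAs(new PrivilegedExceptionAction[T] { def run(): T = code })
}

// Log in from a keytab and read a file on the kerberized cluster inside doAs
val ugi = Some(UserGroupInformation.loginUserFromKeytabAndReturnUGI(
  "user@EXAMPLE.COM", "/etc/security/keytabs/user.keytab"))
val firstLine = ugiDoAs(ugi) { sc.textFile("hdfs://secure-nn:8020/data/in.txt").first() }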

Re: Error :Type mismatch error when passing hdfs file path to spark-csv load method

2016-02-21 Thread Jonathan Kelly
On the line preceding the one that the compiler is complaining about (which doesn't actually have a problem in itself), you declare df as "df"+fileName, making it a string. Then you try to assign a DataFrame to df, but it's already a string. I don't quite understand your intent with that previous

Error :Type mismatch error when passing hdfs file path to spark-csv load method

2016-02-21 Thread Divya Gehlot
Hi, I am trying to dynamically create DataFrames by reading subdirectories under a parent directory. My code looks like: > import org.apache.spark._ > import org.apache.spark.sql._ > val hadoopConf = new org.apache.hadoop.conf.Configuration() > val hdfsConn = org.apache.hadoop.fs.FileSystem.get(new >
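A hedged sketch of the apparent intent (see Jonathan's reply above): list the child directories with the Hadoop API and keep one DataFrame per subdirectory in a map, rather than reusing a single String variable. The parent path and header option are placeholders; the spark-csv format name is the one the subject refers to.

import org.apache.hadoop.fs.{FileSystem, Path}
import org.apache.spark.sql.DataFrame

val fs = FileSystem.get(sc.hadoopConfiguration)
val subDirs = fs.listStatus(new Path("hdfs://namenode:9000/parent"))   // placeholder parent dir
                .filter(_.isDirectory)
                .map(_.getPath)

// One DataFrame per subdirectory, keyed by the subdirectory name
val frames: Map[String, DataFrame] = subDirs.map { p =>
  p.getName -> sqlContext.read
    .format("com.databricks.spark.csv")
    .option("header", "true")
    .load(p.toString)
}.toMap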

how to save Matrix type result to hdfs file using java

2016-01-24 Thread zhangjp
Hi all, I have calculated a covariance; it's a Matrix type. Now I want to save the result to hdfs, how can I do it? thx

show to save Matrix type result to hdfs file using java

2016-01-24 Thread zhangjp
Hi all, I have calculated a covariance; it's a Matrix type. Now I want to save the result to hdfs, how can I do it? thx

Re: how to save Matrix type result to hdfs file using java

2016-01-24 Thread Yanbo Liang
A Matrix can be saved as a column of type MatrixUDT.

Re: how to save Matrix type result to hdfs file using java

2016-01-24 Thread zhangjp
Hi Yanbo, I'm using the Java language and the environment is spark 1.4.1. Can you tell me how to do it in more detail? The following is my code; how can I save the cov to an hdfs file? " RowMatrix mat = new RowMatrix(rows.rdd()); Matrix cov = mat.computeCovar
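One hedged option, sketched in Scala rather than Java for brevity, reusing the RowMatrix mat from the post: computeCovariance() returns a local Matrix on the driver, so its entries can simply be written out as text, one row per line (the output path is a placeholder).

import org.apache.spark.mllib.linalg.Matrix

val cov: Matrix = mat.computeCovariance()
val covAsText = sc.parallelize(0 until cov.numRows).map { i =>
  (0 until cov.numCols).map(j => cov(i, j)).mkString(",")   // one comma-separated row per line
}
covAsText.saveAsTextFile("hdfs://localhost:9000/output/covariance")   // placeholder path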

Read HDFS file from an executor(closure)

2016-01-12 Thread Udit Mehta
Hi, Is there a way to read a text file from inside a spark executor? I need to do this for a streaming application where we need to read a file (whose contents would change) from a closure. I cannot use the "sc.textFile" method since the spark context is not serializable. I also cannot read a file
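A hedged sketch of one workaround: open the file from inside the closure with the Hadoop FileSystem API, which needs no SparkContext on the executor. The input RDD and lookup path below are placeholders for illustration.

import java.io.{BufferedReader, InputStreamReader}
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}

val someRdd = sc.textFile("hdfs://namenode:9000/data/events.txt")   // placeholder input

val filtered = someRdd.mapPartitions { iter =>
  // Runs on the executor: read the (possibly changing) file directly from HDFS
  val fs = FileSystem.get(new Configuration())
  val in = new BufferedReader(new InputStreamReader(fs.open(new Path("/lookup/ids.txt"))))
  val allowed = Iterator.continually(in.readLine()).takeWhile(_ != null).toSet
  in.close()
  iter.filter(allowed.contains)
}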

Re: copy/mv hdfs file to another directory by spark program

2016-01-04 Thread Don Drake
You will need to use the HDFS API to do that. Try something like: val conf = sc.hadoopConfiguration val fs = org.apache.hadoop.fs.FileSystem.get(conf) fs.rename(new org.apache.hadoop.fs.Path("/path/on/hdfs/file.txt"), new org.apache.hadoop.fs.Path("/path/on/hdfs/other/file.txt")) Full API for

copy/mv hdfs file to another directory by spark program

2016-01-04 Thread Zhiliang Zhu
For some file on hdfs, it is necessary to copy/move it to another specific hdfs directory, and the directory name should stay unchanged. I just need to do it in a spark program, not with hdfs commands. Is there any code for this? It does not seem to turn up by searching the spark docs ... Thanks in advance!

Re: copy/mv hdfs file to another directory by spark program

2016-01-04 Thread ayan guha
My guess is no, unless you are okay with reading the data and writing it back again. On Tue, Jan 5, 2016 at 2:07 PM, Zhiliang Zhu wrote: > > For some file on hdfs, it is necessary to copy/move it to some another > specific hdfs directory, and the directory name would keep

Logging spark output to hdfs file

2015-12-08 Thread sunil m
Hi! I configured the log4j.properties file in the conf folder of spark with the following values... log4j.appender.file.File=hdfs:// I expected all log files to write their output to the file in HDFS. Instead, files are created locally. Has anybody tried logging to HDFS by configuring log4j.properties? Warm

Re: Logging spark output to hdfs file

2015-12-08 Thread Jörn Franke
This would require a special HDFS log4j appender. Alternatively try the flume log4j appender > On 08 Dec 2015, at 13:00, sunil m <260885smanik...@gmail.com> wrote: > > Hi! > I configured log4j.properties file in conf folder of spark with following > values... > >

spark streaming HDFS file issue

2015-06-29 Thread ravi tella
directory after the program started running but never got any output. I even passed a non-existent directory as the input to the textFileStream, but the application did not throw any error and ran just like it did when I had the right directory. I am able to access the same HDFS file system from non

Re: Appending to an hdfs file

2015-01-29 Thread Matan Safriel
at 10:39 PM, Matan Safriel dev.ma...@gmail.com wrote: Hi, Is it possible to append to an existing (hdfs) file, through some Spark action? Should there be any reason not to use a hadoop append api within a Spark job? Thanks, Matan

Appending to an hdfs file

2015-01-28 Thread Matan Safriel
Hi, Is it possible to append to an existing (hdfs) file, through some Spark action? Should there be any reason not to use a hadoop append api within a Spark job? Thanks, Matan

Re: Appending to an hdfs file

2015-01-28 Thread Sean Owen
is that RDDs are immutable and so their input and output is naturally immutable, not mutable. On Wed, Jan 28, 2015 at 10:39 PM, Matan Safriel dev.ma...@gmail.com wrote: Hi, Is it possible to append to an existing (hdfs) file, through some Spark action? Should there be any reason not to use a hadoop

Is there a way to delete hdfs file/directory using spark API?

2015-01-21 Thread LinQili
Hi all, I wonder how to delete an hdfs file/directory using the spark API?

Re: Is there a way to delete hdfs file/directory using spark API?

2015-01-21 Thread Akhil Das
wonder how to delete hdfs file/directory using spark API?

Re: Is there a way to delete hdfs file/directory using spark API?

2015-01-21 Thread Raghavendra Pandey
wrote: Hi, all I wonder how to delete hdfs file/directory using spark API?
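For completeness, a small hedged sketch of one common answer to this question: there is no delete call in the Spark API itself, so go through the Hadoop FileSystem handle built from the SparkContext's Hadoop configuration (path is a placeholder).

import org.apache.hadoop.fs.{FileSystem, Path}

val fs = FileSystem.get(sc.hadoopConfiguration)
fs.delete(new Path("/path/on/hdfs/old-output"), true)   // second argument: recursive delete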

Re: pyspark and hdfs file name

2014-11-14 Thread Oleg Ruchovets
Hi Davies. Thank you for the quick answer. I have a code like this: sc = SparkContext(appName=TAD) lines = sc.textFile(sys.argv[1], 1) result = lines.map(doSplit).groupByKey().map(lambda (k,vc): traffic_process_model(k,vc)) result.saveAsTextFile(sys.argv[2]) Can you please give short

Read a HDFS file from Spark using HDFS API

2014-11-14 Thread rapelly kartheek
Hi, I am trying to read an HDFS file from Spark scheduler code. I could find how to do hdfs reads/writes in java, but I need to access hdfs from spark using scala. Can someone please help me in this regard?

Re: Read a HDFS file from Spark using HDFS API

2014-11-14 Thread Akhil Das
like this? val file = sc.textFile("hdfs://localhost:9000/sigmoid/input.txt") Thanks Best Regards On Fri, Nov 14, 2014 at 9:02 PM, rapelly kartheek kartheek.m...@gmail.com wrote: Hi, I am trying to read a HDFS file from Spark scheduler code. I could find how to write hdfs read/writes in java

Re: Read a HDFS file from Spark using HDFS API

2014-11-14 Thread Akhil Das
trying to read a HDFS file from Spark scheduler code. I could find how to write hdfs read/writes in java. But I need to access hdfs from spark using scala. Can someone please help me in this regard.

Re: Read a HDFS file from Spark using HDFS API

2014-11-14 Thread Akhil Das
...@gmail.com] *Sent:* Friday, November 14, 2014 9:42 AM *To:* Akhil Das; user@spark.apache.org *Subject:* Re: Read a HDFS file from Spark using HDFS API No. I am not accessing hdfs from either shell or a spark application. I want to access from spark Scheduler code. I face an error when I

Re: Read a HDFS file from Spark using HDFS API

2014-11-14 Thread rapelly kartheek
@spark.apache.org *Subject:* Re: Read a HDFS file from Spark using HDFS API No. I am not accessing hdfs from either shell or a spark application. I want to access from spark Scheduler code. I face an error when I use sc.textFile() as SparkContext wouldn't have been created yet. So, error says: sc

Re: Read a HDFS file from Spark using HDFS API

2014-11-14 Thread rapelly kartheek
*From:* rapelly kartheek [mailto:kartheek.m...@gmail.com] *Sent:* Friday, November 14, 2014 9:42 AM *To:* Akhil Das; user@spark.apache.org *Subject:* Re: Read a HDFS file from Spark using HDFS API No. I am not accessing hdfs from either shell or a spark application. I want to access from

pyspark and hdfs file name

2014-11-13 Thread Oleg Ruchovets
Hi, I am running a pyspark job. I need to serialize the final result to *hdfs in binary files* and to be able to give a *name to the output files*. I found this post: http://stackoverflow.com/questions/25293962/specifying-the-output-file-name-in-apache-spark but it explains how to do it using scala.

Re: pyspark and hdfs file name

2014-11-13 Thread Davies Liu
One option may be to call HDFS tools or a client to rename them after saveAsXXXFile(). On Thu, Nov 13, 2014 at 9:39 PM, Oleg Ruchovets oruchov...@gmail.com wrote: Hi , I am running pyspark job. I need serialize final result to hdfs in binary files and having ability to give a name for output

Read a HDFS file from Spark source code

2014-11-11 Thread rapelly kartheek
Hi, I am trying to access a file in HDFS from the spark source code. Basically, I am tweaking the spark source code and need to access a file in HDFS from within it. I really don't understand how to go about doing this. Can someone please help me out in this regard? Thank you!!

Re: Read a HDFS file from Spark source code

2014-11-11 Thread Samarth Mailinglist
Instead of a file path, use an HDFS URI. For example (in Python): data = sc.textFile("hdfs://localhost/user/someuser/data") On Wed, Nov 12, 2014 at 10:12 AM, rapelly kartheek kartheek.m...@gmail.com wrote: Hi I am trying to access a file in HDFS from spark source code. Basically, I am

Re: Read a HDFS file from Spark source code

2014-11-11 Thread rapelly kartheek
Hi Sean, I was following this link: http://mund-consulting.com/Blog/Posts/file-operations-in-HDFS-using-java.aspx But I was facing a FileSystem ambiguity error. I really don't have any idea how to go about doing this. Can you please help me get started with this? On Wed, Nov 12, 2014

Re: access hdfs file name in map()

2014-08-01 Thread Roberto Torella
Hi Simon, I'm trying to do the same but I'm quite lost. How did you do that? (Too direct? :) Thanks and ciao, r- -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/access-hdfs-file-name-in-map-tp6551p11160.html Sent from the Apache Spark User List mailing

Re: access hdfs file name in map()

2014-08-01 Thread Xu (Simon) Chen

Re: access hdfs file name in map()

2014-06-04 Thread Xu (Simon) Chen
an integer index and an iterator to another iterator. How does that help with retrieving the hdfs file name? I am obviously missing some context.. Thanks. On May 30, 2014 1:28 AM, Aaron Davidson ilike...@gmail.com wrote: Currently there is not a way to do this using textFile(). However, you

Re: access hdfs file name in map()

2014-06-03 Thread Xu (Simon) Chen
I don't quite get it... mapPartitionsWithIndex takes a function that maps an integer index and an iterator to another iterator. How does that help with retrieving the hdfs file name? I am obviously missing some context. Thanks. On May 30, 2014 1:28 AM, Aaron Davidson ilike...@gmail.com wrote

access hdfs file name in map()

2014-05-29 Thread Xu (Simon) Chen
Hello, A quick question about using spark to parse text-format CSV files stored on hdfs. I have something very simple: sc.textFile("hdfs://test/path/*").map(line => line.split(",")).map(p => (XXX, p[0], p[2])) Here, I want to replace XXX with a string, which is the current csv filename for the line.

Re: access hdfs file name in map()

2014-05-29 Thread Aaron Davidson
Currently there is not a way to do this using textFile(). However, you could pretty straightforwardly define your own subclass of HadoopRDD [1] in order to get access to this information (likely using mapPartitionsWithIndex to look up the InputSplit for a particular partition). Note that
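Aside from subclassing HadoopRDD as suggested above, a simpler (if coarser) hedged alternative is wholeTextFiles, which pairs each file's name with its content so the name can be attached to every parsed line. The path mirrors the question; the column indices are illustrative.

// Caveat: wholeTextFiles reads each file fully into memory, so it suits many small files
val withNames = sc.wholeTextFiles("hdfs://test/path/*").flatMap { case (fileName, content) =>
  content.split("\n").map { line =>
    val p = line.split(",")
    (fileName, p(0), p(2))   // attach the source file name to each record
  }
}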

Re: Error reading HDFS file using spark 0.9.0 / hadoop 2.2.0 - incompatible protobuf 2.5 and 2.4.1

2014-04-16 Thread Arpit Tak
dependency to 0.18.0 in spark's pom.xml. Rebuilding the JAR with this configuration solves the issue. -Anant -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Error-reading-HDFS-file-using-spark-0-9-0-hadoop-2-2-0-incompatible-protobuf-2-5-and-2-4-1

Re: Error reading HDFS file using spark 0.9.0 / hadoop 2.2.0 - incompatible protobuf 2.5 and 2.4.1

2014-04-15 Thread anant
pom.xml. Rebuilding the JAR with this configuration solves the issue. -Anant -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Error-reading-HDFS-file-using-spark-0-9-0-hadoop-2-2-0-incompatible-protobuf-2-5-and-2-4-1-tp2158p4286.html Sent from the Apache Spark User

Re: Error reading HDFS file using spark 0.9.0 / hadoop 2.2.0 - incompatible protobuf 2.5 and 2.4.1

2014-04-15 Thread giive chen

Re: Error reading HDFS file using spark 0.9.0 / hadoop 2.2.0 - incompatible protobuf 2.5 and 2.4.1

2014-04-04 Thread Prasad
Hi Wisely, Could you please post your pom.xml here. Thanks -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Error-reading-HDFS-file-using-spark-0-9-0-hadoop-2-2-0-incompatible-protobuf-2-5-and-2-4-1-tp2158p3770.html Sent from the Apache Spark User List

Error reading HDFS file using spark 0.9.0 / hadoop 2.2.0 - incompatible protobuf 2.5 and 2.4.1

2014-03-26 Thread qingyang li
Egor, I encountered the same problem which you asked about in this thread: http://mail-archives.apache.org/mod_mbox/spark-user/201402.mbox/%3CCAMrx5DwJVJS0g_FE7_2qwMu4Xf0y5VfV=tlyauv2kh5v4k6...@mail.gmail.com%3E Have you fixed this problem? I am using shark to read a table which I have created on

Re: Error reading HDFS file using spark 0.9.0 / hadoop 2.2.0 - incompatible protobuf 2.5 and 2.4.1

2014-03-25 Thread Patrick Wendell
Starting with Spark 0.9 the protobuf dependency we use is shaded and cannot interfere with other protobuf libaries including those in Hadoop. Not sure what's going on in this case. Would someone who is having this problem post exactly how they are building spark? - Patrick On Fri, Mar 21, 2014

Re: Error reading HDFS file using spark 0.9.0 / hadoop 2.2.0 - incompatible protobuf 2.5 and 2.4.1

2014-03-21 Thread Aureliano Buendia
On Tue, Mar 18, 2014 at 12:56 PM, Ognen Duzlevski og...@plainvanillagames.com wrote: On 3/18/14, 4:49 AM, dmpou...@gmail.com wrote: On Sunday, 2 March 2014 19:19:49 UTC+2, Aureliano Buendia wrote: Is there a reason for spark using the older akka? On Sun, Mar 2, 2014 at 1:53 PM, 1esha

Re: Unable to read HDFS file -- SimpleApp.java

2014-03-19 Thread Prasad
Check this thread out: http://apache-spark-user-list.1001560.n3.nabble.com/Error-reading-HDFS-file-using-spark-0-9-0-hadoop-2-2-0-incompatible-protobuf-2-5-and-2-4-1-tp2158p2807.html -- you have to remove the conflicting akka and protobuf versions. Thanks Prasad. -- View this message in context

Re: Error reading HDFS file using spark 0.9.0 / hadoop 2.2.0 - incompatible protobuf 2.5 and 2.4.1

2014-03-18 Thread dmpour23
.1001560.n3.nabble.com/Error-reading-HDFS-file-using-spark-0-9-0-hadoop-2-2-0-incompatible-protobuf-2-5-and-2-4-1-tp2158p2217.html Sent from the Apache Spark User List mailing list archive at Nabble.com. Is the solution to exclude the 2.4.* dependency on protobuf, or will this produce more

Re: Error reading HDFS file using spark 0.9.0 / hadoop 2.2.0 - incompatible protobuf 2.5 and 2.4.1

2014-03-18 Thread Ognen Duzlevski
On 3/18/14, 4:49 AM, dmpou...@gmail.com wrote: On Sunday, 2 March 2014 19:19:49 UTC+2, Aureliano Buendia wrote: Is there a reason for spark using the older akka? On Sun, Mar 2, 2014 at 1:53 PM, 1esha alexey.r...@gmail.com wrote: The problem is in akka remote. It contains files compiled

Re: Error reading HDFS file using spark 0.9.0 / hadoop 2.2.0 - incompatible protobuf 2.5 and 2.4.1

2014-02-28 Thread Egor Pahomov
error while reading HDFS file using spark 0.9.0 -- I am running on hadoop 2.2.0. When I look through, I find that I have both 2.4.1 and 2.5, and some blogs suggest that there are some incompatibility issues between 2.4.1 and 2.5 hduser@prasadHdp1:~/spark-0.9.0-incubating$ find ~/ -name protobuf

Re: Error reading HDFS file using spark 0.9.0 / hadoop 2.2.0 - incompatible protobuf 2.5 and 2.4.1

2014-02-28 Thread Egor Pahomov
, Ognen Duzlevski og...@plainvanillagames.com wrote: A stupid question, by the way, you did compile Spark with Hadoop 2.2.0 support? Ognen On 2/28/14, 10:51 AM, Prasad wrote: Hi I am getting the protobuf error while reading HDFS file using spark 0.9.0 -- i am running on hadoop 2.2.0