Hi,
I want to use Spark Streaming to read binary files from HDFS. The
documentation mentions binaryRecordsStream(directory, recordLength).
But I don't understand what the record length means. Is it the size of
the binary file, or something else?
Regards,
Yogesh
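For context, recordLength is not the file size: it is the fixed size, in bytes, of each record, and Spark splits every file into chunks of exactly that many bytes, one chunk per element of the stream. A plain-Python sketch of that splitting (no Spark involved, just to illustrate the contract):

```python
# Plain-Python sketch of what binaryRecordsStream does with
# recordLength: the input is split into fixed-size chunks of exactly
# recordLength bytes, one chunk per record -- it is NOT the file size.
def split_records(data: bytes, record_length: int) -> list:
    """Split a byte string into fixed-length records."""
    if len(data) % record_length != 0:
        raise ValueError("input length is not a multiple of record_length")
    return [data[i:i + record_length]
            for i in range(0, len(data), record_length)]

# A 12-byte payload with record_length=4 yields three 4-byte records.
print(split_records(b"aaaabbbbcccc", 4))  # → [b'aaaa', b'bbbb', b'cccc']
```

This is why the file length must be a multiple of recordLength: Spark has no other delimiter to find record boundaries.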
Hi,
I am writing a PySpark streaming job in which I am returning a pandas
data frame as a DStream. Now I want to save this DStream dataframe to a
parquet file. How can I do that?
I tried converting it to a Spark data frame, but I get multiple
errors. Please suggest how to do this.
Regards,
Hi,
I am trying to decode binary data with UTF-16 in a Kafka consumer
using Spark Streaming, but it gives the error:
TypeError: 'str' object is not callable
I am doing it in the following way:
kvs = KafkaUtils.createStream(ssc, zkQuorum, "spark-streaming-consumer",
{topic: 1},valueDecoder=
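A likely cause of that TypeError (assuming the decoder was passed as the result of a call): valueDecoder must be a function that Spark invokes on each raw payload, not an already-decoded string. A minimal sketch of a UTF-16 decoder written as a plain function:

```python
# valueDecoder must be a *function* mapping raw message bytes to a
# value. If a str is passed instead (e.g. the result of calling the
# decoder once), Spark later tries to call it per message and raises
# "TypeError: 'str' object is not callable".
def utf16_decoder(raw):
    """Decode one Kafka message payload from UTF-16; tolerate None."""
    if raw is None:
        return None
    return raw.decode("utf-16")

# Passed to KafkaUtils.createStream as valueDecoder=utf16_decoder
# (no parentheses -- hand over the function itself, do not call it).
print(utf16_decoder("hello".encode("utf-16")))  # → hello
```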
Hi,
I have a binary file which I read in a Kafka producer and send to a
message queue. I then read this in the Spark-Kafka consumer as a
streaming job. But it gives me the following error:
UnicodeDecodeError: 'utf8' codec can't decode byte 0xa9 in position 112:
invalid start byte
Can anyone please help?
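For context: the default Kafka value decoder assumes UTF-8 text, and byte 0xa9 is not valid UTF-8, hence the error. For arbitrary binary payloads, a common approach is to keep the raw bytes with an identity decoder, or use a lossless single-byte codec. A stdlib-only sketch of both options:

```python
# The default decoder attempts UTF-8, which fails on binary data.
payload = b"\x00\xa9\xffbinary"

try:
    payload.decode("utf-8")          # what the default decoder attempts
except UnicodeDecodeError as e:
    print("utf-8 fails:", e.reason)

# Option 1: identity decoder keeps the raw bytes untouched,
# e.g. valueDecoder=lambda x: x in createStream.
identity = lambda raw: raw

# Option 2: latin-1 maps every byte to a character, so it never fails
# and round-trips the payload exactly.
text = payload.decode("latin-1")
assert text.encode("latin-1") == payload
print(identity(payload))
```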
Hi,
I am trying to read a binary file in PySpark using the API
binaryRecords(path, recordLength), but it gives all values as ['\x00',
'\x00', '\x00', '\x00',].
But when I try to read the same file using binaryFiles(), it gives
me the correct RDD, but in the form of key-value pairs. The value i
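One note on those '\x00' values: binaryRecords() returns each fixed-length record as raw bytes, so records holding small big-endian integers legitimately start with zero bytes; they only look meaningful after unpacking. A stdlib sketch with struct (the record layout here is an assumption for illustration):

```python
import struct

# binaryRecords() hands back each fixed-length record as raw bytes, so
# a record of small big-endian integers prints as mostly '\x00' until
# it is unpacked. struct interprets the bytes explicitly.
record = b"\x00\x00\x00\x01\x00\x00\x00\x2a"   # two big-endian 32-bit ints

values = struct.unpack(">ii", record)  # '>' big-endian, 'i' 4-byte int
print(values)  # → (1, 42)
```

If all bytes are genuinely zero, the usual suspects are a wrong recordLength or reading the wrong region of the file.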
Hi,
I have two SparkDataFrames, df1 and df2.
Their schemas are as follows:
df1 => SparkDataFrame[id:double, c1:string, c2:string]
df2 => SparkDataFrame[id:double, c3:string, c4:string]
I want to filter out rows from df1 where df1$id does not match df2$id.
I tried the expression: filter(df1, !(df1$id
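The operation wanted here is an anti-join: keep rows of df1 whose id has no match in df2. A plain-Python sketch of that logic, with rows as dicts (in SparkR the same result would come from an except()/join-based approach rather than a row-wise filter expression):

```python
# "Anti-join" sketch: keep rows of df1 whose id does not appear in df2.
df1 = [{"id": 1.0, "c1": "a"}, {"id": 2.0, "c1": "b"}, {"id": 3.0, "c1": "c"}]
df2 = [{"id": 2.0, "c3": "x"}]

df2_ids = {row["id"] for row in df2}              # ids present in df2
unmatched = [row for row in df1 if row["id"] not in df2_ids]
print([row["id"] for row in unmatched])  # → [1.0, 3.0]
```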
Hi,
Is there any way to disable console logging in SparkR?
Regards,
Yogesh
-
To unsubscribe e-mail: user-unsubscr...@spark.apache.org
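One common approach (assuming the default log4j setup Spark ships with): raise the root logger level in conf/log4j.properties so only errors reach the console.

```properties
# conf/log4j.properties -- raising the root logger level quiets the
# console output that SparkR (like every Spark shell) prints by default.
log4j.rootCategory=ERROR, console
```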
Hi,
Is there any way to use UDFs in SparkR?
Regards,
Yogesh
Hi,
I am trying to load and read Excel sheets from HDFS in SparkR using the
XLConnect package.
Can anyone help me find out how to read .xls files from HDFS in SparkR?
Regards,
Yogesh
Hi,
Does anyone know how to handle an empty Kafka topic while a Spark
Streaming job is running?
Regards,
Yogesh
> You mean when rdd.isEmpty() returned false, saveAsTextFile still produced
> an empty file?
>
> Can you show code snippet that demonstrates this ?
>
> Cheers
>
> On Sun, May 22, 2016 at 5:17 AM, Yogesh Vyas wrote:
>>
>> Hi,
>> I am reading files using textFil
Hi,
I am reading files using textFileStream, performing some actions on
them, and then saving them to HDFS using saveAsTextFile.
But whenever there is no file to read, Spark writes an empty RDD
( [] ) to HDFS.
So, how do I handle the empty RDD?
I checked rdd.isEmpty() and rdd.count > 0, but both of th
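The usual fix is to guard the save inside the foreachRDD callback so nothing is written for an empty batch. A plain-Python sketch of that guard, writing to a local file instead of HDFS (the helper name is illustrative):

```python
import os
import tempfile

# Sketch of the guard: only write a batch out when it actually holds
# records, mirroring `if not rdd.isEmpty(): rdd.saveAsTextFile(path)`
# inside a foreachRDD callback.
def save_if_nonempty(records, path):
    records = list(records)
    if not records:            # empty batch: write nothing at all
        return False
    with open(path, "w") as f:
        f.write("\n".join(records))
    return True

out = os.path.join(tempfile.mkdtemp(), "part-00000")
print(save_if_nonempty([], out))            # → False (no file written)
print(save_if_nonempty(["a", "b"], out))    # → True
```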
Hi,
I have XML files which I am reading through textFileStream, and then
filtering out the required elements using traditional conditions and
loops. I would like to know if there are any specific packages or
functions provided in Spark to perform operations on an RDD of XML?
Regards,
Yogesh
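For context: core Spark has no XML-specific RDD operations; the usual RDD-level approach is to parse each record with a standard XML library inside map() (the separate spark-xml package reads XML into DataFrames, if that fits). A stdlib sketch of a per-record parser, with a hypothetical <name> element:

```python
import xml.etree.ElementTree as ET

# Per-record XML parsing, as would run inside rdd.map(extract_name).
def extract_name(xml_record):
    """Pull the <name> text out of one XML record string."""
    root = ET.fromstring(xml_record)
    node = root.find("name")
    return node.text if node is not None else None

records = ['<user><name>ann</name></user>', '<user><name>bob</name></user>']
print([extract_name(r) for r in records])  # → ['ann', 'bob']
```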
Hi,
I am trying to read files in a streaming way using Spark
Streaming. For this I copy files from my local folder to the
source folder from which Spark reads them.
After reading and printing some of the files, it gives the following error:
Caused by: org.apache.hadoop.ipc.RemoteExce
Hi,
I have created a DataFrame in Spark, and now I want to save it directly
into a Hive table. How do I do that?
I have created the Hive table using the following HiveContext:
HiveContext hiveContext = new org.apache.spark.sql.hive.HiveContext(sc.sc());
hiveContext.sql("CREATE TABLE IF NOT EXISTS
Hi,
I am trying to run StreamingKMeans from a Java application, but
it gives the following error:
"Getting java.lang.IllegalArgumentException: requirement failed while
calling Spark's MLlib StreamingKMeans from java application"
Below is my code:
JavaDStream<Vector> v = trainingData.map(new Function
Hi,
Is there any way to visualize the KMeans clusters in Spark?
Can we connect Plotly with Apache Spark in Java?
Thanks,
Yogesh
DataFrame df = sqlContext.read().json(pathToJSONFile);
df.show();
On Mon, Nov 16, 2015 at 12:48 PM, Fengdong Yu wrote:
> what’s your SQL?
>
>
>
>
>> On Nov 16, 2015, at 3:02 PM, Yogesh Vyas wrote:
>>
>> Hi,
>>
>> While I am trying t
Hi,
While I am trying to read a JSON file using SQLContext, I get the
following error:
Exception in thread "main" java.lang.NoSuchMethodError:
org.apache.spark.sql.SQLContext.<init>(Lorg/apache/spark/api/java/JavaSparkContext;)V
at com.honeywell.test.testhive.HiveSpark.main(HiveSpark.java:15)
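A NoSuchMethodError on a constructor like this typically means the Spark version the code was compiled against differs from the one on the classpath at runtime. One common fix is pinning every Spark artifact to a single version in the build file; a pom.xml sketch (the version and Scala suffix below are illustrative placeholders, not taken from this thread):

```xml
<!-- pom.xml sketch: keep all Spark artifacts on one matching version
     so compile-time and runtime signatures agree. -->
<properties>
  <spark.version>1.5.2</spark.version>
</properties>
<dependencies>
  <dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-core_2.10</artifactId>
    <version>${spark.version}</version>
  </dependency>
  <dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-sql_2.10</artifactId>
    <version>${spark.version}</version>
  </dependency>
</dependencies>
```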
https://spark.apache.org/docs/latest/monitoring.html
>
> Romi Kuntsman, Big Data Engineer
> http://www.totango.com
>
> On Thu, Nov 5, 2015 at 2:08 PM, Yogesh Vyas wrote:
>>
>> Hi,
>> How we can use JMX and JCo
Hi,
How can we use JMX and JConsole to monitor our Spark applications?
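Per Spark's monitoring documentation, enabling the JMX sink in the metrics configuration exposes Spark's metrics as MBeans, which JConsole can browse after attaching to the driver or executor JVM:

```properties
# conf/metrics.properties -- enable the JMX sink for all instances.
*.sink.jmx.class=org.apache.spark.metrics.sink.JmxSink
```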
Hi,
I am new to Spark and was trying to do some experiments with it.
I had a JavaPairDStream<String, List<String>> RDD.
I want to get the list of string from its previous state. For that I
use updateStateByKey function as follows:
final Function2<List<String>, Optional<List<String>>,
Optional<List<String>>> updateFunc =
new Function2<List<String>, Optional<List<String>>,
Op
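For reference, the updateStateByKey contract (in any language binding): the update function receives the new values seen for a key in the current batch plus that key's previous state, and returns the new state. A plain-Python sketch of keeping a growing list of strings per key:

```python
# updateStateByKey sketch: the function gets this batch's new values
# for a key plus the key's previous state (None on first sight) and
# returns the new state -- here, the accumulated list of strings.
def update_func(new_values, prev_state):
    state = list(prev_state) if prev_state is not None else []
    state.extend(new_values)
    return state

state = None
for batch in (["a"], [], ["b", "c"]):   # three micro-batches for one key
    state = update_func(batch, state)
print(state)  # → ['a', 'b', 'c']
```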
-- Forwarded message --
From: Yogesh Vyas
Date: Thu, Oct 15, 2015 at 6:08 PM
Subject: Get the previous state string
To: user@spark.apache.org