Hi,
I want to use Spark Streaming to read binary files from HDFS. The
documentation mentions binaryRecordsStream(directory, recordLength).
But I don't understand what the record length means. Is it the size of
the binary file, or something else?
Regards,
Yogesh
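For context, recordLength is not the file size: it is the fixed size, in bytes, of each record, and Spark splits every file into chunks of exactly that many bytes, one chunk per element of the stream. A plain-Python sketch of that splitting (no Spark involved, just to illustrate the contract):

```python
# Plain-Python sketch of what binaryRecordsStream does with
# recordLength: the input is split into fixed-size chunks of exactly
# recordLength bytes, one chunk per record -- it is NOT the file size.
def split_records(data: bytes, record_length: int) -> list:
    """Split a byte string into fixed-length records."""
    if len(data) % record_length != 0:
        raise ValueError("input length is not a multiple of record_length")
    return [data[i:i + record_length]
            for i in range(0, len(data), record_length)]

# A 12-byte payload with record_length=4 yields three 4-byte records.
print(split_records(b"aaaabbbbcccc", 4))  # → [b'aaaa', b'bbbb', b'cccc']
```

This is why the file length must be a multiple of recordLength: Spark has no other delimiter to find record boundaries.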
Hi,
I am writing a PySpark streaming job in which I am returning a pandas
data frame as a DStream. Now I want to save this DStream dataframe to a
parquet file. How can I do that?
I tried converting it to a Spark data frame, but I get multiple
errors. Please suggest how to do this.
Regards,
Hi,
I am trying to decode binary data with UTF-16 in a Kafka consumer
using Spark Streaming, but it gives the error:
TypeError: 'str' object is not callable
I am doing it in the following way:
kvs = KafkaUtils.createStream(ssc, zkQuorum, "spark-streaming-consumer",
{topic: 1},valueDecoder=
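A likely cause of that TypeError (assuming the decoder was passed as the result of a call): valueDecoder must be a function that Spark invokes on each raw payload, not an already-decoded string. A minimal sketch of a UTF-16 decoder written as a plain function:

```python
# valueDecoder must be a *function* mapping raw message bytes to a
# value. If a str is passed instead (e.g. the result of calling the
# decoder once), Spark later tries to call it per message and raises
# "TypeError: 'str' object is not callable".
def utf16_decoder(raw):
    """Decode one Kafka message payload from UTF-16; tolerate None."""
    if raw is None:
        return None
    return raw.decode("utf-16")

# Passed to KafkaUtils.createStream as valueDecoder=utf16_decoder
# (no parentheses -- hand over the function itself, do not call it).
print(utf16_decoder("hello".encode("utf-16")))  # → hello
```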
Hi,
I have a binary file which I read in a Kafka producer and send to a
message queue. I then read this in the Spark-Kafka consumer as a
streaming job. But it gives me the following error:
UnicodeDecodeError: 'utf8' codec can't decode byte 0xa9 in position 112:
invalid start byte
Can anyone please help?
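For context: the default Kafka value decoder assumes UTF-8 text, and byte 0xa9 is not valid UTF-8, hence the error. For arbitrary binary payloads, a common approach is to keep the raw bytes with an identity decoder, or use a lossless single-byte codec. A stdlib-only sketch of both options:

```python
# The default decoder attempts UTF-8, which fails on binary data.
payload = b"\x00\xa9\xffbinary"

try:
    payload.decode("utf-8")          # what the default decoder attempts
except UnicodeDecodeError as e:
    print("utf-8 fails:", e.reason)

# Option 1: identity decoder keeps the raw bytes untouched,
# e.g. valueDecoder=lambda x: x in createStream.
identity = lambda raw: raw

# Option 2: latin-1 maps every byte to a character, so it never fails
# and round-trips the payload exactly.
text = payload.decode("latin-1")
assert text.encode("latin-1") == payload
print(identity(payload))
```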
Hi,
I am trying to read a binary file in PySpark using the API
binaryRecords(path, recordLength), but it gives all values as ['\x00',
'\x00', '\x00', '\x00',].
But when I try to read the same file using binaryFiles(), it gives
me the correct RDD, but in the form of key-value pairs. The value i
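One note on those '\x00' values: binaryRecords() returns each fixed-length record as raw bytes, so records holding small big-endian integers legitimately start with zero bytes; they only look meaningful after unpacking. A stdlib sketch with struct (the record layout here is an assumption for illustration):

```python
import struct

# binaryRecords() hands back each fixed-length record as raw bytes, so
# a record of small big-endian integers prints as mostly '\x00' until
# it is unpacked. struct interprets the bytes explicitly.
record = b"\x00\x00\x00\x01\x00\x00\x00\x2a"   # two big-endian 32-bit ints

values = struct.unpack(">ii", record)  # '>' big-endian, 'i' 4-byte int
print(values)  # → (1, 42)
```

If all bytes are genuinely zero, the usual suspects are a wrong recordLength or reading the wrong region of the file.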
Hi,
I have two SparkDataFrames, df1 and df2.
Their schemas are as follows:
df1 => SparkDataFrame[id:double, c1:string, c2:string]
df2 => SparkDataFrame[id:double, c3:string, c4:string]
I want to filter out rows from df1 where df1$id does not match df2$id.
I tried the expression: filter(df1, !(df1$id
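The operation wanted here is an anti-join: keep rows of df1 whose id has no match in df2. A plain-Python sketch of that logic, with rows as dicts (in SparkR the same result would come from an except()/join-based approach rather than a row-wise filter expression):

```python
# "Anti-join" sketch: keep rows of df1 whose id does not appear in df2.
df1 = [{"id": 1.0, "c1": "a"}, {"id": 2.0, "c1": "b"}, {"id": 3.0, "c1": "c"}]
df2 = [{"id": 2.0, "c3": "x"}]

df2_ids = {row["id"] for row in df2}              # ids present in df2
unmatched = [row for row in df1 if row["id"] not in df2_ids]
print([row["id"] for row in unmatched])  # → [1.0, 3.0]
```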
Hi,
Is there any way to disable console logging in SparkR?
Regards,
Yogesh
-
To unsubscribe e-mail: user-unsubscr...@spark.apache.org
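One common approach (assuming the default log4j setup Spark ships with): raise the root logger level in conf/log4j.properties so only errors reach the console.

```properties
# conf/log4j.properties -- raising the root logger level quiets the
# console output that SparkR (like every Spark shell) prints by default.
log4j.rootCategory=ERROR, console
```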
Hi,
Is there any way to use UDFs in SparkR?
Regards,
Yogesh
Hi,
I am trying to load and read Excel sheets from HDFS in SparkR using the
XLConnect package.
Can anyone help me find out how to read .xls files from HDFS in SparkR?
Regards,
Yogesh
Hi,
Does anyone know how to handle an empty Kafka topic while a Spark
Streaming job is running?
Regards,
Yogesh
> You mean when rdd.isEmpty() returned false, saveAsTextFile still produced
> an empty file?
>
> Can you show code snippet that demonstrates this ?
>
> Cheers
>
> On Sun, May 22, 2016 at 5:17 AM, Yogesh Vyas wrote:
>>
>> Hi,
>> I am reading files using textFil
Hi,
I am reading files using textFileStream, performing some actions on
them, and then saving them to HDFS using saveAsTextFile.
But whenever there is no file to read, Spark writes an empty RDD
( [] ) to HDFS.
So, how do I handle the empty RDD?
I checked rdd.isEmpty() and rdd.count > 0, but both of th
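The usual fix is to guard the save inside the foreachRDD callback so nothing is written for an empty batch. A plain-Python sketch of that guard, writing to a local file instead of HDFS (the helper name is illustrative):

```python
import os
import tempfile

# Sketch of the guard: only write a batch out when it actually holds
# records, mirroring `if not rdd.isEmpty(): rdd.saveAsTextFile(path)`
# inside a foreachRDD callback.
def save_if_nonempty(records, path):
    records = list(records)
    if not records:            # empty batch: write nothing at all
        return False
    with open(path, "w") as f:
        f.write("\n".join(records))
    return True

out = os.path.join(tempfile.mkdtemp(), "part-00000")
print(save_if_nonempty([], out))            # → False (no file written)
print(save_if_nonempty(["a", "b"], out))    # → True
```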
Hi,
I have XML files which I am reading through textFileStream, and then
filtering out the required elements using traditional conditions and
loops. I would like to know if there are any specific packages or
functions provided in Spark to perform operations on an RDD of XML?
Regards,
Yogesh
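For context: core Spark has no XML-specific RDD operations; the usual RDD-level approach is to parse each record with a standard XML library inside map() (the separate spark-xml package reads XML into DataFrames, if that fits). A stdlib sketch of a per-record parser, with a hypothetical <name> element:

```python
import xml.etree.ElementTree as ET

# Per-record XML parsing, as would run inside rdd.map(extract_name).
def extract_name(xml_record):
    """Pull the <name> text out of one XML record string."""
    root = ET.fromstring(xml_record)
    node = root.find("name")
    return node.text if node is not None else None

records = ['<user><name>ann</name></user>', '<user><name>bob</name></user>']
print([extract_name(r) for r in records])  # → ['ann', 'bob']
```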
Hi,
I am trying to read files in a streaming way using Spark
Streaming. For this I copy files from my local folder to the
source folder from which Spark reads them.
After reading and printing some of the files, it gives the following error:
Caused by: org.apache.hadoop.ipc.RemoteExce
Hi,
I have created a DataFrame in Spark, and now I want to save it directly
into a Hive table. How do I do that?
I have created the Hive table using the following HiveContext:
HiveContext hiveContext = new org.apache.spark.sql.hive.HiveContext(sc.sc());
hiveContext.sql("CREATE TABLE IF NOT EXISTS
Hi,
I am trying to run StreamingKMeans from a Java application, but
it gives the following error:
"Getting java.lang.IllegalArgumentException: requirement failed while
calling Spark's MLlib StreamingKMeans from java application"
Below is my code:
JavaDStream<Vector> v = trainingData.map(new Function
Hi,
Is there any way to visualize the KMeans clusters in Spark?
Can we connect Plotly with Apache Spark in Java?
Thanks,
Yogesh
DataFrame df = sqlContext.read().json(pathToJSONFile);
df.show();
On Mon, Nov 16, 2015 at 12:48 PM, Fengdong Yu wrote:
> what’s your SQL?
>
>
>
>
>> On Nov 16, 2015, at 3:02 PM, Yogesh Vyas wrote:
>>
>> Hi,
>>
>> While I am trying t
Hi,
While I am trying to read a JSON file using SQLContext, I get the
following error:
Exception in thread "main" java.lang.NoSuchMethodError:
org.apache.spark.sql.SQLContext.<init>(Lorg/apache/spark/api/java/JavaSparkContext;)V
at com.honeywell.test.testhive.HiveSpark.main(HiveSpark.java:15)
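A NoSuchMethodError on a constructor like this typically means the Spark version the code was compiled against differs from the one on the classpath at runtime. One common fix is pinning every Spark artifact to a single version in the build file; a pom.xml sketch (the version and Scala suffix below are illustrative placeholders, not taken from this thread):

```xml
<!-- pom.xml sketch: keep all Spark artifacts on one matching version
     so compile-time and runtime signatures agree. -->
<properties>
  <spark.version>1.5.2</spark.version>
</properties>
<dependencies>
  <dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-core_2.10</artifactId>
    <version>${spark.version}</version>
  </dependency>
  <dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-sql_2.10</artifactId>
    <version>${spark.version}</version>
  </dependency>
</dependencies>
```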
https://spark.apache.org/docs/latest/monitoring.html
>
> Romi Kuntsman, Big Data Engineer
> http://www.totango.com
>
> On Thu, Nov 5, 2015 at 2:08 PM, Yogesh Vyas wrote:
>>
>> Hi,
>> How we can use JMX and JCo
Hi,
How can we use JMX and JConsole to monitor our Spark applications?
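Per Spark's monitoring documentation, enabling the JMX sink in the metrics configuration exposes Spark's metrics as MBeans, which JConsole can browse after attaching to the driver or executor JVM:

```properties
# conf/metrics.properties -- enable the JMX sink for all instances.
*.sink.jmx.class=org.apache.spark.metrics.sink.JmxSink
```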
Hi,
I am new to Spark and was trying to do some experiments with it.
I had a JavaPairDStream<String, List<String>> RDD.
I want to get the list of string from its previous state. For that I
use updateStateByKey function as follows:
final Function2<List<String>, Optional<List<String>>,
Optional<List<String>>> updateFunc =
new Function2<List<String>, Optional<List<String>>,
Op
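For reference, the updateStateByKey contract (in any language binding): the update function receives the new values seen for a key in the current batch plus that key's previous state, and returns the new state. A plain-Python sketch of keeping a growing list of strings per key:

```python
# updateStateByKey sketch: the function gets this batch's new values
# for a key plus the key's previous state (None on first sight) and
# returns the new state -- here, the accumulated list of strings.
def update_func(new_values, prev_state):
    state = list(prev_state) if prev_state is not None else []
    state.extend(new_values)
    return state

state = None
for batch in (["a"], [], ["b", "c"]):   # three micro-batches for one key
    state = update_func(batch, state)
print(state)  # → ['a', 'b', 'c']
```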
-- Forwarded message --
From: Yogesh Vyas
Date: Thu, Oct 15, 2015 at 6:08 PM
Subject: Get the previous state string
To: user@spark.apache.org