Hello,
I'm building a Spark app that needs to read a large number of log files from
S3. I do so in the code by constructing the file list and passing it
to the context as follows:
val myRDD = sc.textFile("s3n://mybucket/file1,s3n://mybucket/file2,...,
s3n://mybucket/fileN")
When running
What about running, say, 2 executors per machine, each of which thinks
it should use all cores?
You can also multi-thread your map function manually within your code, with
careful use of a java.util.concurrent.Executor.
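A minimal sketch of that idea (not from the original thread; the pool size and
the toUpperCase work are illustrative, and myRDD is the RDD from the earlier
snippet), using mapPartitions so one pool is created per task:

import java.util.concurrent.{Callable, Executors}

val threaded = myRDD.mapPartitions { iter =>
  val pool = Executors.newFixedThreadPool(4)   // threads per task; tune for your workload
  try {
    // submit every record to the pool, then collect the results
    val futures = iter.map { line =>
      pool.submit(new Callable[String] { override def call(): String = line.toUpperCase })
    }.toList
    futures.map(_.get()).iterator
  } finally {
    pool.shutdown()
  }
}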
On Wed, Nov 26, 2014 at 6:57 AM, yotto yotto.k...@autodesk.com
Hi,
I use the following code to read in data and extract the unique users using
Spark SQL. The data is 1.2 TB and I am running this on a cluster with 3 TB of
memory. It appears that there is enough memory, but the program just freezes
after some time where it maps the RDD to the case class Play. (If
Hi,
I've confirmed that the latest Spark with either Hive 0.12 or 0.13.1 fails to
optimize the auto broadcast join in my query. I have a query that joins a
huge fact table with 15 tiny dimension tables.
I'm currently using an older version of Spark which was built on Oct. 12.
Has anyone else met
You can try creating a Hadoop Configuration and setting the S3 configuration,
i.e. access keys etc.
Now, for reading files from S3, use newAPIHadoopFile and pass the config
object here along with the key and value classes.
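A rough sketch of that approach (not Lalit's original code; the key names and
paths are illustrative, and it assumes the s3n filesystem):

import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.io.{LongWritable, Text}
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat

// clone the context's Hadoop config and add the S3 credentials
val hadoopConf = new Configuration(sc.hadoopConfiguration)
hadoopConf.set("fs.s3n.awsAccessKeyId", "YOUR_ACCESS_KEY")
hadoopConf.set("fs.s3n.awsSecretAccessKey", "YOUR_SECRET_KEY")

// read the files through the new Hadoop API, keeping only the line text
val lines = sc.newAPIHadoopFile(
    "s3n://mybucket/file1,s3n://mybucket/file2",
    classOf[TextInputFormat], classOf[LongWritable], classOf[Text],
    hadoopConf
  ).map(_._2.toString)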
-
Lalit Yadav
la...@sigmoidanalytics.com
Hi Mukesh,
Once you create a streaming job, a DAG is created which contains your job
plan, i.e. all map transformations and all action operations to be performed
on each batch of the streaming application.
So, once your job is started, the input DStream takes the data input from the
specified source and all
You can call Scala code from Java, even when it involves overloaded
operators, since they are also just methods with names like $plus and
$times. In this case, it's not quite feasible since the Scala API is
complex and would end up forcing you to manually supply some other
implementation details
Thanks Lalit; Setting the access + secret keys in the configuration works
even when calling sc.textFile. Is there a way to select which hadoop s3
native filesystem implementation would be used at runtime using the hadoop
configuration?
Thanks,
Tomer
On Wed, Nov 26, 2014 at 11:08 AM, lalit1303
Hi,
I'm trying to start the thrift server but failing:
Exception in thread "main" java.lang.NoClassDefFoundError:
org/apache/tez/dag/api/SessionNotRunning
at
org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:353)
at
Is there some place I can read more about it? I can't find any reference.
I actually want to flatten these structures and not return them from the UDF.
Thanks,
Daniel
On Tue, Nov 25, 2014 at 8:44 PM, Michael Armbrust mich...@databricks.com
wrote:
Maps should just be scala maps, structs are
many thanks for adding this so quickly.
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/why-MatrixFactorizationModel-private-tp19763p19855.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
Hi.
Is there a way to submit a Spark job to a Hadoop-YARN cluster from Java code?
-Naveen
Hi Sean,
The values brzPi and brzTheta are of the form
breeze.linalg.DenseVector[Double]. So would I have to convert them back to
simple vectors and use a library to perform addition/multiplication?
If yes, can you please point me to the conversion logic and vector operation
library for Java?
The above issue happens while trying to do the below activity on a JavaRDD
(calling take() on the RDD):
JavaRDD<String> loadedRDD = sc.textFile(...);
String[] tokens = loadedRDD.take(1).get(0).split(",");
Hi all,
I am getting familiar with MLlib and a thing I noticed is that running the
MovieLensALS example on the MovieLens dataset for an increasing number of
iterations does not decrease the RMSE.
The results for a 0.6 training / 0.4 test split are below. For a training
split of 0.8, the
I'll take a wild guess that you have mismatching versions of Spark at
play. Your cluster has one build and you're accidentally including
another version.
I think this code path has changed recently (
copying user group - I keep replying directly vs reply all :)
On Wed, Nov 26, 2014 at 2:03 PM, Nick Pentreath nick.pentre...@gmail.com
wrote:
ALS will be guaranteed to decrease the squared error (therefore RMSE) in
each iteration, on the *training* set.
This does not hold for the *test* set
Hi guys,
When we are using Kinesis with 1 shard then it works fine. But when we use more
than 1, it falls into an infinite loop and no data is processed by
Spark Streaming. In the Kinesis DynamoDB table, I can see that it keeps
increasing the leaseCounter. But it does not start processing.
I am
Hi Sean
Thanks for reply,
We upgraded our spark cluster from 1.1.0 to 1.2.0.
And we also thought that this issue might be due to mismatching Spark jar
versions.
But we double checked and reinstalled our app completely on a new system
with the spark-1.2.0 distro, but still no result.
Facing the same
Hi,
I am running Spark in standalone mode.
1) I have a file of 286 MB in HDFS (block size is 64 MB), so it is split into
5 blocks. When I have the file in HDFS, 5 tasks are generated and so 5
files in the output. My understanding is that there will be a separate
partition for each block and
Try repartitioning the RDD to, say, 2x the number of cores available before
saveAsTextFile.
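For example (a hedged sketch; the partition count and output path are
illustrative, e.g. 15 nodes x 4 cores x 2):

val repartitioned = rdd.repartition(120)   // roughly 2x the total cores in the cluster
repartitioned.saveAsTextFile("hdfs:///path/to/output")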
-
Lalit Yadav
la...@sigmoidanalytics.com
Hi,
I have a short question regarding the compute() of a SchemaRDD.
For SchemaRDD the actual queryExecution seems to be triggered via
collect(), while the compute triggers only the compute() of the parent and
copies the data (Please correct me if I am wrong!).
Is this compute() triggered at all
Hello,
I have a large data calculation in Spark, distributed across several
nodes. In the end, I want to write to a single output file.
For this I do:
output.coalesce(1, false).saveAsTextFile(filename).
What happens is all the data from the workers flows to a single worker, and
that one
Hello,
I work with GraphX. When I call graph.partitionBy(..) nothing happens
because, as I understand it, all transformations are lazy and partitionBy is
built using transformations.
Is there a way to force Spark to actually execute this transformation without
using any action?
--
Hi,
can't you just use graph.partitionBy(..).collect()?
Cheers,
Joerg
On Wed, Nov 26, 2014 at 2:25 PM, Hlib Mykhailenko hlib.mykhaile...@inria.fr
wrote:
Hello,
I work with Graphx. When I call graph.partitionBy(..) nothing happens,
because, as I understood, that all transformation are lazy
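A minimal sketch of forcing the lazy partitionBy to materialize with a cheap
action rather than collecting the whole graph to the driver (the partition
strategy here is just an example, not something from the thread):

import org.apache.spark.graphx.PartitionStrategy

val partitioned = graph.partitionBy(PartitionStrategy.EdgePartition2D).cache()
partitioned.edges.count()   // any action forces the repartitioning to actually run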
Once again, the error increases even with the training dataset. The results are:
Running 1 iteration
For 1 iter.: Test RMSE = 1.2447121194304893 Training RMSE =
1.2394166987104076 (34.751317636 s).
Running 5 iterations
For 5 iter.: Test RMSE = 1.3253957117600659 Training RMSE =
How are you computing RMSE?
And how are you training the model -- not with trainImplicit, right?
I wonder if you are somehow optimizing something besides RMSE.
On Wed, Nov 26, 2014 at 2:36 PM, Kostas Kloudas kklou...@gmail.com wrote:
Once again, the error even with the training dataset increases.
For the training I am using the code in the MovieLensALS example with
trainImplicit set to false
and for the training RMSE I use the
val rmseTr = computeRmse(model, training, params.implicitPrefs).
The computeRmse() method is provided in the MovieLensALS class.
Thanks a lot,
Kostas
On
Hi,
We have been running Spark 1.0.2 with Mesos 0.20.1 in fine-grained mode and for
the most part it has been working well.
We have been using mesos://zk://server1:2181,server2:2181,server3:2181/mesos as
the spark master URL and this works great to get the Mesos leader.
Unfortunately, this
SparkContext.textFile() cannot load a file using a UNC path on Windows.
I run the following on Windows XP:
val conf = new
SparkConf().setAppName("testproj1.ClassificationEngine").setMaster("local")
val sc = new SparkContext(conf)
Hi Judy,
Are you somehow modifying Spark's classpath to include jars from
Hadoop and Hive that you have running on the machine? The issue seems
to be that you are somehow including a version of Hadoop that
references the original guava package. The Hadoop that is bundled in
the Spark jars should
Hi guys
I started playing with Spark Streaming, and I came up with an idea that I
wonder is a valid idea:
building a Jetty input stream, which is basically a Jetty server that streams
each HTTP request it gets.
What do you think of this idea?
Thanks, Guy
Just to double check - I looked at our own assembly jar and I
confirmed that our Hadoop configuration class does use the correctly
shaded version of Guava. My best guess here is that somehow a separate
Hadoop library is ending up on the classpath, possibly because Spark
put it there somehow.
tar
How about?
- Create a SparkContext
- setMaster as *yarn-cluster*
- Create a JavaSparkContext with the above SparkContext
And that will submit it to the yarn cluster.
Thanks
Best Regards
On Wed, Nov 26, 2014 at 4:20 PM, Naveen Kumar Pokala
npok...@spcapitaliq.com wrote:
Hi.
Is there a
Liang,
Can you do me a favor and run predictOnValues on some sample test data, and
see if it is working on your end? It is not working for me; it keeps
predicting 0.
My code:
val conf = new
SparkConf().setMaster("local[2]").setAppName("StreamingLinearRegression")
val ssc = new
I have it working without any issues (tried with 5 shards), except my Java
version was 1.7.
Here's the piece of code that I used:
System.setProperty("AWS_ACCESS_KEY_ID",
this.kConf.getOrElse("access_key", ""))
System.setProperty("AWS_SECRET_KEY", this.kConf.getOrElse("secret", ""))
val streamName
Hi,
When I'm trying to build the Spark assembly to include the dependencies related
to the Thrift server, the build fails with the following error.
Could anyone help me with this?
[ERROR] Failed to execute goal on project spark-assembly_2.10: Could not
resolve dependencies for project
1. On HDFS, files are treated with a ~64 MB block size. When you put the same
file in a local file system (ext3/ext4) it will be treated with a different
block size (in your case it looks like ~32 MB), and that's why you are seeing
9 output files.
2. You could set *num-executors* to increase the number of executor
Thanks Yanbo!
Modified code below:
val conf = new
SparkConf().setMaster("local[2]").setAppName("StreamingLinearRegression")
val ssc = new StreamingContext(conf, Seconds(args(2).toLong))
val trainingData = ssc.textFileStream(args(0)).map(LabeledPoint.parse)
val testData =
This one would give you a better understanding
http://stackoverflow.com/questions/24622108/apache-spark-the-number-of-cores-vs-the-number-of-executors
Thanks
Best Regards
On Wed, Nov 26, 2014 at 10:32 PM, Akhil Das ak...@sigmoidanalytics.com
wrote:
1. On HDFS files are treated as ~64mb in
Hi,
I have a question regarding failure of executors: how does Spark reassign
partitions or tasks when executors fail? Is it necessary that new
executors have the same executor IDs as the ones that were lost, or are
these IDs irrelevant for failover?
Spark has a known problem where it will do a pass of metadata on a large
number of small files serially, in order to find the partition information
prior to starting the job. This will probably not be repaired by switching
the FS impl.
However, you can change the FS being used like so (prior to
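(The example was cut off in the archive; a hedged sketch of the kind of
setting meant here, using the stock Hadoop property names on the
SparkContext's Hadoop configuration before reading:)

// choose the s3n filesystem implementation explicitly
sc.hadoopConfiguration.set("fs.s3n.impl", "org.apache.hadoop.fs.s3native.NativeS3FileSystem")
sc.hadoopConfiguration.set("fs.s3n.awsAccessKeyId", "YOUR_ACCESS_KEY")
sc.hadoopConfiguration.set("fs.s3n.awsSecretAccessKey", "YOUR_SECRET_KEY")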
What's your cluster size? For streaming to work, it needs shards + 1
executors.
On Wed, Nov 26, 2014, 5:53 PM A.K.M. Ashrafuzzaman
ashrafuzzaman...@gmail.com wrote:
Hi guys,
When we are using Kinesis with 1 shard then it works fine. But when we use
more that 1 then it falls into an infinite
Hi,
I guess this is fixed by https://github.com/apache/spark/pull/3110
which is not for complex type casting but makes inserting into hive
table be able to handle complex types ignoring nullability.
I also sent a pull-request (https://github.com/apache/spark/pull/3150)
for complex type casting
Test message
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/This-is-just-a-test-tp19895.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
Exactly how the query is executed actually depends on a couple of factors
as we do a bunch of optimizations based on the top physical operator and
the final RDD operation that is performed. In general the compute function
is only used when you are doing SQL followed by other RDD operations (map,
I also modified the example to try 1, 5, 9, ... iterations as you did,
and also ran with the same default parameters. I used the
sample_movielens_data.txt file. Is that what you're using?
My result is:
Iteration 1 Test RMSE = 1.426079653593016 Train RMSE = 1.5013155094216357
Iteration 5 Test
I have found this paper, which seems to answer most of the questions about
lifetime/duration:
https://www.cs.berkeley.edu/~matei/papers/2012/hotcloud_spark_streaming.pdf
Tian
On Tuesday, November 25, 2014 4:02 AM, Mukesh Jha
me.mukesh@gmail.com wrote:
Hey Experts,
I wanted to understand in
When I am running spark locally, RDD saveAsObjectFile writes the file to
local file system (ex : path /data/temp.txt)
and
when I am running spark on YARN cluster, RDD saveAsObjectFile writes the
file to hdfs. (ex : path /data/temp.txt )
Is there a way to explicitly specify the local file system
Prepend file:// to the path
Daniel
On 26 Nov 2014, at 20:15, firemonk9 dhiraj.peech...@gmail.com wrote:
When I am running spark locally, RDD saveAsObjectFile writes the file to
local file system (ex : path /data/temp.txt)
and
when I am running spark on YARN cluster, RDD
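For example (a hedged sketch; the paths are illustrative):

rdd.saveAsObjectFile("file:///data/temp.txt")   // force the local file system
rdd.saveAsObjectFile("hdfs:///data/temp.txt")   // or address HDFS explicitly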
Add "file://" in front of your path.
On 11/26/14, 10:15 AM, firemonk9 dhiraj.peech...@gmail.com wrote:
When I am running spark locally, RDD saveAsObjectFile writes the file to
local file system (ex : path /data/temp.txt)
and
when I am running spark on YARN cluster, RDD saveAsObjectFile
On Wed, Nov 26, 2014 at 04:06:40PM +, Guy Doulberg wrote:
Hi guys
I started playing with spark streaming, and I came up with an idea that I
wonder if it a valid idea.
Building a jetty input stream, which is basically a jetty server that each
http request it gets it streams .
I am running on a 15 node cluster and am trying to set partitioning to
balance the work across all nodes. I am using an Accumulator to track work
by Mac Address but would prefer to use data known to the Spark environment
- Executor ID, and Function ID show up in the Spark UI and Task ID and
The training RMSE may increase due to regularization. Squared loss
only represents part of the global loss. If you watch the sum of the
squared loss and the regularization, it should be non-increasing.
-Xiangrui
On Wed, Nov 26, 2014 at 9:53 AM, Sean Owen so...@cloudera.com wrote:
I also modified
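To make Xiangrui's point concrete, the regularized objective ALS minimizes is
(standard form, stated here from memory rather than from the thread):

  min over U,V of  \sum_{(i,j) observed} (r_ij - u_i^T v_j)^2
                   + \lambda ( \sum_i ||u_i||^2 + \sum_j ||v_j||^2 )

so the squared-error term alone (what RMSE measures) can go up between
iterations while this whole sum still goes down.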
At 2014-11-26 05:25:10 -0800, Hlib Mykhailenko hlib.mykhaile...@inria.fr
wrote:
I work with Graphx. When I call graph.partitionBy(..) nothing happens,
because, as I understood, that all transformation are lazy and partitionBy is
built using transformations.
Is there way how to force spark
Just to close out this one, I noticed that the number of cached partitions was
quite low for each of the RDDs (1 - 14). Increasing the number of partitions
(~400) resolved this for me.
Hi Guys,
Has anyone else experienced the same thing as above?
-
--Harihar
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/Is-Spark-or-GraphX-runs-fast-a-performance-comparison-on-Page-Rank-tp19710p19909.html
Sent from the Apache Spark User List
I think that actually would not work - yarn-cluster mode expects a specific
deployment path that uses SparkSubmit. Setting master as yarn-client should
work.
-Sandy
On Wed, Nov 26, 2014 at 8:32 AM, Akhil Das ak...@sigmoidanalytics.com
wrote:
How about?
- Create a SparkContext
- setMaster as
Hello,
I am a new user of Spark. After running the Spark Streaming example
'StatefulNetworkWordCount', I was confused by what was shown on the web UI, as
follows:
http://apache-spark-user-list.1001560.n3.nabble.com/file/n19911/uneven.jpg
It seems like almost all the tasks were assigned to the
Are all of your join keys the same? And I guess the join types are all "left"
joins; https://github.com/apache/spark/pull/3362 is probably what you need.
Also, Spark SQL doesn't support multi-way joins (and multi-way broadcast joins)
currently; https://github.com/apache/spark/pull/3270 should be
Spark SQL doesn't handle DISTINCT well currently, particularly in the case you
described: it leads all of the data to fall onto a single node and keeps it in
memory only.
The dev community actually has solutions for this; it will probably be solved
after the release of Spark 1.2.
Thanks Sean. That worked out well.
For anyone who happens onto this post and wants to do the same, these are the
steps I took to do as Sean suggested...
(Note this is for a standalone cluster.)
Log in to the master
~/spark/sbin/stop-all.sh
edit ~/spark/conf/spark-env.sh
modify the line
Just to add one more point. If Spark Streaming knows that an RDD will not be
used any more, I believe Spark will not try to retrieve data it will not
use any more. However, in practice, I often encounter the "cannot compute
split" error. Based on my understanding, this is because Spark cleared
out
I've noticed some strange behavior when I try to use
SchemaRDD.saveAsTable() with a SchemaRDD that I've loaded from a JSON file
that contains elements with nested arrays. For example, with a file
test.json that contains the single line:
{"values":[1,2,3]}
and with code like the following:
Can you elaborate on the usage pattern that led to "cannot compute
split"? Are you using the RDDs generated by DStream, outside the
DStream logic? Something like running interactive Spark jobs
(independent of the Spark Streaming ones) on RDDs generated by
DStreams? If that is the case, what is
I guess I already have the answer to what I have to do here, which is to
configure the Kryo object with the strategy as above.
Now the question becomes: how can I pass this custom kryo configuration to
the spark kryo serializer / kryo registrator?
I've had a look at the code but I am still fairly
After playing around with this a little more, I discovered that:
1. If test.json contains something like {"values":[null,1,2,3]}, the
schema auto-determined by SchemaRDD.jsonFile() will have element: integer
(containsNull = true), and then
SchemaRDD.saveAsTable()/SchemaRDD.insertInto() will work
Instead of SPARK_WORKER_INSTANCES you can also set SPARK_WORKER_CORES, to have
one worker that thinks it has more cores.
Matei
On Nov 26, 2014, at 5:01 PM, Yotto Koga yotto.k...@autodesk.com wrote:
Thanks Sean. That worked out well.
For anyone who happens onto this post and wants to do
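A hedged spark-env.sh sketch of the SPARK_WORKER_CORES suggestion above (the
number is illustrative; here a worker with 8 physical cores advertises 16 so
two tasks can share each core):

# in conf/spark-env.sh on each worker
export SPARK_WORKER_CORES=16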
Thanks Yanbo,
I wonder why SSV does not complain when I create one using new SSV(4,
Array(1, 3, 5, 7))? Is there no error check for this even in the Breeze
sparse vector's constructor? That is very strange.
Shivani
On Tue, Nov 25, 2014 at 7:25 PM, Yanbo Liang yanboha...@gmail.com wrote:
Hi
hi,
I don't know whether this question should be asked here; if not, please point
me elsewhere, thanks.
We are currently using Hive on Spark. When reading a smallint field, it
reports the error:
Cannot get field 'i16Val' because union is currently set to i32Val
I googled and found only the source code of
In the past I have worked around this problem by avoiding sc.textFile().
Instead I read the data directly inside of a Spark job. Basically, you
start with an RDD where each entry is a file in S3 and then flatMap that
with something that reads the files and returns the lines.
Here's an example:
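(The example that followed was cut off in the archive. A rough sketch of the
approach being described; the AWS SDK client usage, bucket, and key names are
illustrative, not the original code, and assume the aws-java-sdk is on the
classpath:)

import scala.io.Source
import com.amazonaws.services.s3.AmazonS3Client

// one entry per S3 object; the reading happens inside the tasks, not on the driver
val keys = Seq("logs/2014-11-26/part-0000", "logs/2014-11-26/part-0001")
val lines = sc.parallelize(keys, keys.size).flatMap { key =>
  val s3 = new AmazonS3Client()                     // picks up credentials from the environment
  val obj = s3.getObject("mybucket", key)
  Source.fromInputStream(obj.getObjectContent).getLines().toList
}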
This has been fixed in Spark 1.1.1 and Spark 1.2
https://issues.apache.org/jira/browse/SPARK-3704
On Wed, Nov 26, 2014 at 7:10 PM, 诺铁 noty...@gmail.com wrote:
hi,
don't know whether this question should be asked here, if not, please
point me out, thanks.
we are currently using hive on
Indeed. That's nice.
Thanks!
yotto
From: Matei Zaharia [matei.zaha...@gmail.com]
Sent: Wednesday, November 26, 2014 6:11 PM
To: Yotto Koga
Cc: Sean Owen; user@spark.apache.org
Subject: Re: configure to run multiple tasks on a core
Instead of
Hi,
I ran into a problem when doing a join operation on two RDDs. For example,
RDDa: RDD[(String,String)] and RDDb: RDD[(String,Int)]. Then the result is
RDDc: RDD[(String,(String,Int))] = RDDa.join(RDDb). But I find the results in
RDDc are incorrect compared with RDDb. What's wrong with the join?
Did you set the Spark master as local[*]? If so, then it means that the number
of executors is equal to the number of cores of the machine. Perhaps your Mac
machine has more cores (certainly more than the number of Kinesis shards + 1).
Try explicitly setting the master as local[N] where N is the number of Kinesis
shards
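For example, with 2 shards that would be (a hedged sketch, following the
shards + 1 rule mentioned earlier in the thread):

val conf = new SparkConf().setMaster("local[3]").setAppName("KinesisApp")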
Thanks Takuya! Will take a look into it later. And sorry for not being
able to review all the PRs in time recently (mostly because of rushing
Spark 1.2 release and Thanksgiving :) ).
On 11/27/14 1:35 AM, Takuya UESHIN wrote:
Hi,
I guess this is fixed by
What’s the command line you used to build Spark? Notice that you need to
add |-Phive-thriftserver| to build the JDBC Thrift server. This profile
was once removed in v1.1.0, but added back in v1.2.0 because of a
dependency issue introduced by Scala 2.11 support.
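For current master / 1.2.0 the build command would look something like this (a
hedged example; the other profiles depend on your Hadoop setup):

mvn -Pyarn -Phadoop-2.4 -Dhadoop.version=2.4.0 -Phive -Phive-thriftserver -DskipTests clean package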
On 11/27/14 12:53 AM,
Thanks for your response.
I'm using the following command.
mvn -Pyarn -Phadoop-2.4 -Dhadoop.version=2.4.0 -Phive -DskipTests clean
package
Regards.
I am trying to submit a Spark Streaming program. When I submit a batch process
it works, but when I do the same with Spark Streaming it throws an exception.
Can anyone please help?
14/11/26 17:42:25 INFO server.AbstractConnector: Started
SocketConnector@0.0.0.0:50016
14/11/26 17:42:25 INFO server.Server:
Hi,
I was going through the paper on Pregel titled "Pregel: A System for
Large-Scale Graph Processing". In the second section, named "Model of
Computation", it says that the input to a Pregel computation is a directed
graph.
Is it the same in the Pregel abstraction of GraphX too? Do we always
What version are you trying to build? I was at first assuming you're
using the most recent master, but from your first mail it seems that you
were trying to build Spark v1.1.0?
On 11/27/14 12:57 PM, vdiwakar.malladi wrote:
Thanks for your response.
I'm using the following command.
mvn
Yes, I'm building it from Spark 1.1.0
Thanks in advance.
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/Unable-to-generate-assembly-jar-which-includes-jdbc-thrift-server-tp19887p19937.html
Sent from the Apache Spark User List mailing list archive at
Hello Jonathan,
There was a bug regarding casting data types before inserting into a Hive
table. Hive does not have the notion of containsNull for array values.
So, for a Hive table, containsNull will always be true for an array and
we should ignore this field for Hive. This issue has been
For hive on spark, did you mean the thrift server of Spark SQL or
https://issues.apache.org/jira/browse/HIVE-7292? If you meant the latter
one, I think Hive's mailing list will be a good place to ask (see
https://hive.apache.org/mailing_lists.html).
Thanks,
Yin
On Wed, Nov 26, 2014 at 10:49 PM,
Hm, then the command line you used should be fine. Actually, I just tried
it locally and it's fine. Make sure to run it in the root directory of the
Spark source tree (don't |cd| into assembly).
On 11/27/14 1:35 PM, vdiwakar.malladi wrote:
Yes, I'm building it from Spark 1.1.0
Thanks in advance.
Hi,
I'm trying to open the Spark source code with IntelliJ IDEA.
I opened pom.xml on the Spark source code root directory.
Project tree is displayed in the Project tool window.
But, when I open a source file, say
org.apache.spark.deploy.yarn.ClientBase.scala, a lot of red marks show up on
the
What version of Spark are you running? A Python API for Spark Streaming is
only available via GitHub at the moment and has not been released in any
version of Spark.
On Tue, Nov 25, 2014 at 10:23 AM, Venkat, Ankam
ankam.ven...@centurylink.com wrote:
Any idea how to resolve this?
Regards,
Hi,
I've been fiddling with spark/*/storage/blockManagerMasterActor.getPeers()
definition in the context of blockManagerMaster.askDriverWithReply()
sending a request GetPeers().
1) I couldn't understand what 'selfIndex' is used for.
2) Also, I tried modifying the 'peers' array by just
Hi,
Some information about the error.
On File | Project Structure window, the following error message is displayed
with pink background:
Library 'Maven: org.scala-lang:scala-compiler-bundle:2.10.4' is not used
Can it be a hint?
From: Taeyun Kim [mailto:taeyun@innowireless.com]
Hi Tri,
Maybe my latest response to your problem was lost; anyway, the following
code snippet runs correctly.
val model = new
StreamingLinearRegressionWithSGD().setInitialWeights(Vectors.zeros(args(3).toInt))
model.algorithm.setIntercept(true)
Because that all setXXX() function in
I see. As the exception stated, Maven can't find |unzip| to help with
building PySpark. So you need a Windows version of |unzip| (probably
from MinGW or Cygwin?)
On 11/27/14 2:10 PM, vdiwakar.malladi wrote:
Thanks for your prompt responses.
I'm generating assembly jar file from windows 7
I meant the latter... thanks
On Thu, Nov 27, 2014 at 1:42 PM, Yin Huai huaiyin@gmail.com wrote:
For hive on spark, did you mean the thrift server of Spark SQL or
https://issues.apache.org/jira/browse/HIVE-7292? If you meant the latter
one, I think Hive's mailing list will be a good place to
I have a use case that requires a huge number of keys' state to be
stored and updated with the latest values from the stream. I am planning to
use updateStateByKey with checkpointing.
I would like to know the performance implications of updateStateByKey as the
number of keys stored in the state grows
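A minimal sketch of that pattern for reference (names, the checkpoint path, and
the running-count state are illustrative; `pairs` is assumed to be a
DStream[(String, Long)]):

// enable checkpointing so the state RDD lineage gets truncated periodically
ssc.checkpoint("hdfs:///checkpoints/myapp")

// keep a running total per key
val state = pairs.updateStateByKey[Long] { (newValues: Seq[Long], current: Option[Long]) =>
  Some(current.getOrElse(0L) + newValues.sum)
}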
Try to add your cluster's core-site.xml, yarn-site.xml, and hdfs-site.xml
to the CLASSPATH (and on SPARK_CLASSPATH) and submit the job.
Thanks
Best Regards
On Thu, Nov 27, 2014 at 12:24 PM, Naveen Kumar Pokala
npok...@spcapitaliq.com wrote:
Code is in my windows machine and cluster is in some
Hi TD,
I am using Spark Streaming to consume data from Kafka and do some
aggregation and ingest the results into RDS. I do use foreachRDD in the
program. I am planning to use Spark streaming in our production pipeline
and it performs well in generating the results. Unfortunately, we plan to
have
I'm waiting online. Who can help me, please?
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/GraphX-java-lang-NoSuchMethodError-org-apache-spark-graphx-Graph-apply-tp19958p19959.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.