It is reading the files now, but it throws another error complaining that the vector sizes do not match. I saw this error reported on Stack Overflow:
http://stackoverflow.com/questions/30737361/getting-java-lang-illegalargumentexception-requirement-failed-while-calling-spa

Also, in the Scala example model.setRandomCenters takes two arguments, whereas the Java method needs three? Any clues?

Thanks
Ashutosh

On Wed, Apr 27, 2016 at 9:59 PM, Ashutosh Kumar <kmr.ashutos...@gmail.com> wrote:

> The problem seems to be that streamContext.textFileStream(path) is not
> reading the file at all. It does not throw any exception either. I tried
> some tricks given on the mailing lists, like copying the file into the
> specified directory after starting the program, touching the file to
> change its timestamp, etc., but no luck.
>
> Thanks
> Ashutosh
>
> On Wed, Apr 27, 2016 at 2:43 PM, Niki Pavlopoulou <n...@exonar.com> wrote:
>
>> One of the reasons this happened to me (assuming everything is OK in
>> your streaming process) is running in local mode: instead of local, use
>> local[*] or local[4].
>>
>> On 26 April 2016 at 15:10, Ashutosh Kumar <kmr.ashutos...@gmail.com> wrote:
>>
>>> I created a streaming k-means job based on the Scala example.
>>> It keeps running without any error but never prints predictions.
>>>
>>> Here is the log:
>>>
>>> 19:15:05,050 INFO org.apache.spark.streaming.scheduler.InputInfoTracker - remove old batch metadata: 1461678240000 ms
>>> 19:15:10,001 INFO org.apache.spark.streaming.dstream.FileInputDStream - Finding new files took 1 ms
>>> 19:15:10,001 INFO org.apache.spark.streaming.dstream.FileInputDStream - New files at time 1461678310000 ms:
>>> 19:15:10,007 INFO org.apache.spark.streaming.dstream.FileInputDStream - Finding new files took 2 ms
>>> 19:15:10,007 INFO org.apache.spark.streaming.dstream.FileInputDStream - New files at time 1461678310000 ms:
>>> 19:15:10,014 INFO org.apache.spark.streaming.scheduler.JobScheduler - Added jobs for time 1461678310000 ms
>>> 19:15:10,015 INFO org.apache.spark.streaming.scheduler.JobScheduler - Starting job streaming job 1461678310000 ms.0 from job set of time 1461678310000 ms
>>> 19:15:10,028 INFO org.apache.spark.SparkContext - Starting job: collect at StreamingKMeans.scala:89
>>> 19:15:10,028 INFO org.apache.spark.scheduler.DAGScheduler - Job 292 finished: collect at StreamingKMeans.scala:89, took 0.000041 s
>>> 19:15:10,029 INFO org.apache.spark.streaming.scheduler.JobScheduler - Finished job streaming job 1461678310000 ms.0 from job set of time 1461678310000 ms
>>> 19:15:10,029 INFO org.apache.spark.streaming.scheduler.JobScheduler - Starting job streaming job 1461678310000 ms.1 from job set of time 1461678310000 ms
>>> -------------------------------------------
>>> Time: 1461678310000 ms
>>> -------------------------------------------
>>>
>>> 19:15:10,036 INFO org.apache.spark.streaming.scheduler.JobScheduler - Finished job streaming job 1461678310000 ms.1 from job set of time 1461678310000 ms
>>> 19:15:10,036 INFO org.apache.spark.rdd.MapPartitionsRDD - Removing RDD 2912 from persistence list
>>> 19:15:10,037 INFO org.apache.spark.rdd.MapPartitionsRDD - Removing RDD 2911 from persistence list
>>> 19:15:10,037 INFO org.apache.spark.storage.BlockManager - Removing RDD 2912
>>> 19:15:10,037 INFO org.apache.spark.streaming.scheduler.JobScheduler - Total delay: 0.036 s for time 1461678310000 ms (execution: 0.021 s)
>>> 19:15:10,037 INFO org.apache.spark.rdd.UnionRDD - Removing RDD 2800 from persistence list
>>> 19:15:10,037 INFO org.apache.spark.storage.BlockManager - Removing RDD 2911
>>> 19:15:10,037 INFO org.apache.spark.streaming.dstream.FileInputDStream - Cleared 1 old files that were older than 1461678250000 ms: 1461678245000 ms
>>> 19:15:10,037 INFO org.apache.spark.rdd.MapPartitionsRDD - Removing RDD 2917 from persistence list
>>> 19:15:10,037 INFO org.apache.spark.storage.BlockManager - Removing RDD 2800
>>> 19:15:10,037 INFO org.apache.spark.rdd.MapPartitionsRDD - Removing RDD 2916 from persistence list
>>> 19:15:10,037 INFO org.apache.spark.rdd.MapPartitionsRDD - Removing RDD 2915 from persistence list
>>> 19:15:10,037 INFO org.apache.spark.rdd.MapPartitionsRDD - Removing RDD 2914 from persistence list
>>> 19:15:10,037 INFO org.apache.spark.rdd.UnionRDD - Removing RDD 2803 from persistence list
>>> 19:15:10,037 INFO org.apache.spark.streaming.dstream.FileInputDStream - Cleared 1 old files that were older than 1461678250000 ms: 1461678245000 ms
>>> 19:15:10,038 INFO org.apache.spark.streaming.scheduler.ReceivedBlockTracker - Deleting batches ArrayBuffer()
>>> 19:15:10,038 INFO org.apache.spark.storage.BlockManager - Removing RDD 2917
>>> 19:15:10,038 INFO org.apache.spark.streaming.scheduler.InputInfoTracker - remove old batch metadata: 1461678245000 ms
>>> 19:15:10,038 INFO org.apache.spark.storage.BlockManager - Removing RDD 2914
>>> 19:15:10,038 INFO org.apache.spark.storage.BlockManager - Removing RDD 2916
>>> 19:15:10,038 INFO org.apache.spark.storage.BlockManager - Removing RDD 2915
>>> 19:15:10,038 INFO org.apache.spark.storage.BlockManager - Removing RDD 2803
>>> 19:15:15,001 INFO org.apache.spark.streaming.dstream.FileInputDStream - Finding new files took 1 ms
>>> 19:15:15,001 INFO org.apache.spark.streaming.dstream.FileInputDStream - New files at time 1461678315000 ms:
>>> .
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
>>> For additional commands, e-mail: user-h...@spark.apache.org
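
[Editor's note on the two questions in the top message.] The Scala StreamingKMeans.setRandomCenters has a default value for its third parameter (the random seed), which is why the Scala example can call it with two arguments; Java has no default parameters, so the Java API requires all three (dimension, weight, seed). The "requirement failed" IllegalArgumentException usually means the dimension of the incoming vectors does not match the dimension passed to setRandomCenters, or varies between lines of the input files. Below is a minimal, Spark-free sketch of that consistency check; the "[x1,x2,...]" line format follows the streaming k-means example, and the class and method names here are hypothetical illustrations, not Spark APIs:

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-alone checker: parses lines in the "[x1,x2,...]" format
// fed to the streaming k-means example and reports the index of the first
// line whose dimension differs from the first line's. Stdlib only, no Spark.
public class DimensionCheck {
    static double[] parseVector(String line) {
        String body = line.trim();
        body = body.substring(1, body.length() - 1); // strip '[' and ']'
        return Arrays.stream(body.split(","))
                     .mapToDouble(s -> Double.parseDouble(s.trim()))
                     .toArray();
    }

    // Returns -1 if every line has the first line's dimension,
    // otherwise the index of the first mismatching line.
    static int inconsistentLine(List<String> lines) {
        int expected = parseVector(lines.get(0)).length;
        for (int i = 1; i < lines.size(); i++) {
            if (parseVector(lines.get(i)).length != expected) return i;
        }
        return -1;
    }

    public static void main(String[] args) {
        List<String> ok  = Arrays.asList("[1.0, 2.0]", "[3.0, 4.0]");
        List<String> bad = Arrays.asList("[1.0, 2.0]", "[3.0, 4.0, 5.0]");
        System.out.println(inconsistentLine(ok));   // -1
        System.out.println(inconsistentLine(bad));  // 1
    }
}
```

Running a check like this over the training and prediction files before starting the stream can pinpoint which file (and which line) triggers the mismatch.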
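
[Editor's note on the earlier textFileStream question.] The file-stream source only picks up files whose modification time falls inside the current batch window, and it assumes files appear in the monitored directory atomically; writing or copying a file into the directory in place can cause it to be missed or half-read. The usual workaround is to write the file somewhere else on the same filesystem and then move it into the watched directory. A stdlib-only sketch of that pattern, with hypothetical paths and names:

```java
import java.io.IOException;
import java.nio.file.*;

// Sketch: make a data file visible to textFileStream all at once by writing
// it outside the monitored directory and then moving it in. Paths, the class
// name, and the file contents are illustrative examples.
public class AtomicDrop {
    public static Path drop(Path stagingDir, Path watchedDir,
                            String name, String contents) throws IOException {
        Files.createDirectories(stagingDir);
        Files.createDirectories(watchedDir);
        Path tmp = stagingDir.resolve(name);
        Files.write(tmp, contents.getBytes());   // finish writing first
        // The move is atomic on typical local filesystems, so the streaming
        // job never observes a partially written file.
        return Files.move(tmp, watchedDir.resolve(name),
                          StandardCopyOption.ATOMIC_MOVE);
    }

    public static void main(String[] args) throws IOException {
        Path base = Files.createTempDirectory("stream-demo");
        Path placed = drop(base.resolve("staging"), base.resolve("input"),
                           "batch1.txt", "[1.0, 2.0]\n");
        System.out.println(Files.readAllLines(placed).get(0)); // [1.0, 2.0]
    }
}
```

The move step also explains why "touching" an in-place file rarely helps: the file must arrive in the directory after the streaming context starts, not merely change its timestamp while already there.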