It is reading the files now, but it throws another error complaining that the vector sizes do not match. I saw this error reported on Stack Overflow:
http://stackoverflow.com/questions/30737361/getting-java-lang-illegalargumentexception-requirement-failed-while-calling-spa

Also, in the Scala example model.setRandomCenters takes two arguments, whereas the Java method needs three? Any clues?

Thanks
Ashutosh

On Wed, Apr 27, 2016 at 9:59 PM, Ashutosh Kumar <kmr.ashutos...@gmail.com> wrote:

> The problem seems to be that streamContext.textFileStream(path) is not
> reading the file at all. It does not throw any exception either. I tried
> some tricks given on the mailing lists, like copying the file into the
> specified directory after starting the program, touching the file to
> change its timestamp, etc., but no luck.
>
> Thanks
> Ashutosh
>
> On Wed, Apr 27, 2016 at 2:43 PM, Niki Pavlopoulou <n...@exonar.com> wrote:
>
>> One of the reasons this happened to me (assuming everything is OK in
>> your streaming process) is running in local mode: instead of local, use
>> local[*] or local[4].
>>
>> On 26 April 2016 at 15:10, Ashutosh Kumar <kmr.ashutos...@gmail.com> wrote:
>>
>>> I created a streaming k-means job based on the Scala example.
>>> It keeps running without any error but never prints predictions.
>>>
>>> Here is the log:
>>>
>>> 19:15:05,050 INFO org.apache.spark.streaming.scheduler.InputInfoTracker - remove old batch metadata: 1461678240000 ms
>>> 19:15:10,001 INFO org.apache.spark.streaming.dstream.FileInputDStream - Finding new files took 1 ms
>>> 19:15:10,001 INFO org.apache.spark.streaming.dstream.FileInputDStream - New files at time 1461678310000 ms:
>>> 19:15:10,007 INFO org.apache.spark.streaming.dstream.FileInputDStream - Finding new files took 2 ms
>>> 19:15:10,007 INFO org.apache.spark.streaming.dstream.FileInputDStream - New files at time 1461678310000 ms:
>>> 19:15:10,014 INFO org.apache.spark.streaming.scheduler.JobScheduler - Added jobs for time 1461678310000 ms
>>> 19:15:10,015 INFO org.apache.spark.streaming.scheduler.JobScheduler - Starting job streaming job 1461678310000 ms.0 from job set of time 1461678310000 ms
>>> 19:15:10,028 INFO org.apache.spark.SparkContext - Starting job: collect at StreamingKMeans.scala:89
>>> 19:15:10,028 INFO org.apache.spark.scheduler.DAGScheduler - Job 292 finished: collect at StreamingKMeans.scala:89, took 0.000041 s
>>> 19:15:10,029 INFO org.apache.spark.streaming.scheduler.JobScheduler - Finished job streaming job 1461678310000 ms.0 from job set of time 1461678310000 ms
>>> 19:15:10,029 INFO org.apache.spark.streaming.scheduler.JobScheduler - Starting job streaming job 1461678310000 ms.1 from job set of time 1461678310000 ms
>>> -------------------------------------------
>>> Time: 1461678310000 ms
>>> -------------------------------------------
>>>
>>> 19:15:10,036 INFO org.apache.spark.streaming.scheduler.JobScheduler - Finished job streaming job 1461678310000 ms.1 from job set of time 1461678310000 ms
>>> 19:15:10,036 INFO org.apache.spark.rdd.MapPartitionsRDD - Removing RDD 2912 from persistence list
>>> 19:15:10,037 INFO org.apache.spark.rdd.MapPartitionsRDD - Removing RDD 2911 from persistence list
>>> 19:15:10,037 INFO org.apache.spark.storage.BlockManager - Removing RDD 2912
>>> 19:15:10,037 INFO org.apache.spark.streaming.scheduler.JobScheduler - Total delay: 0.036 s for time 1461678310000 ms (execution: 0.021 s)
>>> 19:15:10,037 INFO org.apache.spark.rdd.UnionRDD - Removing RDD 2800 from persistence list
>>> 19:15:10,037 INFO org.apache.spark.storage.BlockManager - Removing RDD 2911
>>> 19:15:10,037 INFO org.apache.spark.streaming.dstream.FileInputDStream - Cleared 1 old files that were older than 1461678250000 ms: 1461678245000 ms
>>> 19:15:10,037 INFO org.apache.spark.rdd.MapPartitionsRDD - Removing RDD 2917 from persistence list
>>> 19:15:10,037 INFO org.apache.spark.storage.BlockManager - Removing RDD 2800
>>> 19:15:10,037 INFO org.apache.spark.rdd.MapPartitionsRDD - Removing RDD 2916 from persistence list
>>> 19:15:10,037 INFO org.apache.spark.rdd.MapPartitionsRDD - Removing RDD 2915 from persistence list
>>> 19:15:10,037 INFO org.apache.spark.rdd.MapPartitionsRDD - Removing RDD 2914 from persistence list
>>> 19:15:10,037 INFO org.apache.spark.rdd.UnionRDD - Removing RDD 2803 from persistence list
>>> 19:15:10,037 INFO org.apache.spark.streaming.dstream.FileInputDStream - Cleared 1 old files that were older than 1461678250000 ms: 1461678245000 ms
>>> 19:15:10,038 INFO org.apache.spark.streaming.scheduler.ReceivedBlockTracker - Deleting batches ArrayBuffer()
>>> 19:15:10,038 INFO org.apache.spark.storage.BlockManager - Removing RDD 2917
>>> 19:15:10,038 INFO org.apache.spark.streaming.scheduler.InputInfoTracker - remove old batch metadata: 1461678245000 ms
>>> 19:15:10,038 INFO org.apache.spark.storage.BlockManager - Removing RDD 2914
>>> 19:15:10,038 INFO org.apache.spark.storage.BlockManager - Removing RDD 2916
>>> 19:15:10,038 INFO org.apache.spark.storage.BlockManager - Removing RDD 2915
>>> 19:15:10,038 INFO org.apache.spark.storage.BlockManager - Removing RDD 2803
>>> 19:15:15,001 INFO org.apache.spark.streaming.dstream.FileInputDStream - Finding new files took 1 ms
>>> 19:15:15,001 INFO org.apache.spark.streaming.dstream.FileInputDStream - New files at time 1461678315000 ms:
>>> .
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
>>> For additional commands, e-mail: user-h...@spark.apache.org
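
[Editor's note on the two questions in the top message.] The Scala StreamingKMeans.setRandomCenters has a default value for its third parameter (the random seed), which is why the Scala example can call it with two arguments; Java has no default parameters, so the Java API requires all three (dimension, weight, seed). The "requirement failed" IllegalArgumentException usually means the dimension of the incoming vectors does not match the dimension passed to setRandomCenters, or varies between lines of the input files. Below is a minimal, Spark-free sketch of that consistency check; the "[x1,x2,...]" line format follows the streaming k-means example, and the class and method names here are hypothetical illustrations, not Spark APIs:

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-alone checker: parses lines in the "[x1,x2,...]" format
// fed to the streaming k-means example and reports the index of the first
// line whose dimension differs from the first line's. Stdlib only, no Spark.
public class DimensionCheck {
    static double[] parseVector(String line) {
        String body = line.trim();
        body = body.substring(1, body.length() - 1); // strip '[' and ']'
        return Arrays.stream(body.split(","))
                     .mapToDouble(s -> Double.parseDouble(s.trim()))
                     .toArray();
    }

    // Returns -1 if every line has the first line's dimension,
    // otherwise the index of the first mismatching line.
    static int inconsistentLine(List<String> lines) {
        int expected = parseVector(lines.get(0)).length;
        for (int i = 1; i < lines.size(); i++) {
            if (parseVector(lines.get(i)).length != expected) return i;
        }
        return -1;
    }

    public static void main(String[] args) {
        List<String> ok  = Arrays.asList("[1.0, 2.0]", "[3.0, 4.0]");
        List<String> bad = Arrays.asList("[1.0, 2.0]", "[3.0, 4.0, 5.0]");
        System.out.println(inconsistentLine(ok));   // -1
        System.out.println(inconsistentLine(bad));  // 1
    }
}
```

Running a check like this over the training and prediction files before starting the stream can pinpoint which file (and which line) triggers the mismatch.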
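
[Editor's note on the earlier textFileStream question.] The file-stream source only picks up files whose modification time falls inside the current batch window, and it assumes files appear in the monitored directory atomically; writing or copying a file into the directory in place can cause it to be missed or half-read. The usual workaround is to write the file somewhere else on the same filesystem and then move it into the watched directory. A stdlib-only sketch of that pattern, with hypothetical paths and names:

```java
import java.io.IOException;
import java.nio.file.*;

// Sketch: make a data file visible to textFileStream all at once by writing
// it outside the monitored directory and then moving it in. Paths, the class
// name, and the file contents are illustrative examples.
public class AtomicDrop {
    public static Path drop(Path stagingDir, Path watchedDir,
                            String name, String contents) throws IOException {
        Files.createDirectories(stagingDir);
        Files.createDirectories(watchedDir);
        Path tmp = stagingDir.resolve(name);
        Files.write(tmp, contents.getBytes());   // finish writing first
        // The move is atomic on typical local filesystems, so the streaming
        // job never observes a partially written file.
        return Files.move(tmp, watchedDir.resolve(name),
                          StandardCopyOption.ATOMIC_MOVE);
    }

    public static void main(String[] args) throws IOException {
        Path base = Files.createTempDirectory("stream-demo");
        Path placed = drop(base.resolve("staging"), base.resolve("input"),
                           "batch1.txt", "[1.0, 2.0]\n");
        System.out.println(Files.readAllLines(placed).get(0)); // [1.0, 2.0]
    }
}
```

The move step also explains why "touching" an in-place file rarely helps: the file must arrive in the directory after the streaming context starts, not merely change its timestamp while already there.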