Edi Bice created SAMOA-58:
-----------------------------
Summary: Samoa AvroFileStream from HDFSFileStreamSource stops at end of first file
Key: SAMOA-58
URL: https://issues.apache.org/jira/browse/SAMOA-58
Project: SAMOA
Issue Type: Bug
Components: SAMOA-Instances
Environment: RHEL 6.6, java 1.8.0_72
Reporter: Edi Bice
It appears Samoa is capable of streaming a collection of files as a single
stream, effectively concatenating the files. However, when using Samoa
AvroFileStream with HDFSFileStreamSource, the stream seems to stop at the end
of the first file:
bin/samoa local target/SAMOA-Local-0.4.0-incubating-SNAPSHOT.jar "PrequentialEvaluation -i -1 -l (classifiers.ensemble.Bagging -s 100) -s (AvroFileStream -s HDFSFileStreamSource -f /tmp/order_and_feats_flat_avro/2016_02_18/ -c 1 -e binary) -f 10000"
2016-02-18 20:43:20,991 [main] INFO org.apache.samoa.evaluation.EvaluatorProcessor (EvaluatorProcessor.java:183) - last event is received!
2016-02-18 20:43:20,991 [main] INFO org.apache.samoa.evaluation.EvaluatorProcessor (EvaluatorProcessor.java:184) - total count: 262144
...
2016-02-18 20:43:20,993 [main] INFO org.apache.samoa.evaluation.EvaluatorProcessor (EvaluatorProcessor.java:191) - total evaluation time: 34 seconds for 262144 instances
bash-4.1$ hadoop fs -ls /tmp/order_and_feats_flat_avro/2016_02_18 | more
Found 70 items
-rw-r--r-- 3 yarn hdfs 230855335 2016-02-18 16:01 /tmp/order_and_feats_flat_avro/2016_02_18/hdfs-1a238673-c4ec-4462-be67-78d573efa790-00001
-rw-r--r-- 3 yarn hdfs 229800273 2016-02-18 16:04 /tmp/order_and_feats_flat_avro/2016_02_18/hdfs-1a238673-c4ec-4462-be67-78d573efa790-00002
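For reference, the behaviour the description expects, where several files are
read as one continuous stream, can be sketched as follows. This is a minimal
illustration only and not SAMOA's implementation: the class
ConcatenatedStreamSketch, the RecordReader interface and the helpers linesOf
and readAll are all hypothetical names, and in-memory strings stand in for the
Avro files on HDFS. The point is only that reaching the end of one file should
advance to the next file rather than end the stream.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

/** Hypothetical sketch (not SAMOA code): reads several sources as one stream. */
public class ConcatenatedStreamSketch {

    /** Stand-in for a per-file reader; returns null at end of file. */
    interface RecordReader {
        String next() throws IOException;
    }

    private final Iterator<RecordReader> files;
    private RecordReader current;

    ConcatenatedStreamSketch(List<RecordReader> readers) {
        this.files = readers.iterator();
        this.current = files.hasNext() ? files.next() : null;
    }

    /** Returns the next record, or null once every file has been consumed. */
    String nextRecord() throws IOException {
        while (current != null) {
            String record = current.next();
            if (record != null) {
                return record;
            }
            // End of the current file: move on instead of ending the stream.
            current = files.hasNext() ? files.next() : null;
        }
        return null;
    }

    /** Wraps in-memory text, one record per line, as a RecordReader. */
    static RecordReader linesOf(String text) {
        BufferedReader reader = new BufferedReader(new StringReader(text));
        return reader::readLine;
    }

    /** Drains all readers and returns the records in the order read. */
    static List<String> readAll(List<RecordReader> readers) throws IOException {
        ConcatenatedStreamSketch stream = new ConcatenatedStreamSketch(readers);
        List<String> out = new ArrayList<>();
        for (String r = stream.nextRecord(); r != null; r = stream.nextRecord()) {
            out.add(r);
        }
        return out;
    }

    public static void main(String[] args) throws IOException {
        // Three "files": two records, an empty file, then one record.
        List<String> records = readAll(Arrays.asList(
                linesOf("a1\na2"), linesOf(""), linesOf("b1")));
        System.out.println(records); // prints [a1, a2, b1]
    }
}
```

With this structure, an empty or exhausted file is skipped and the stream ends
only after the last file, which is the behaviour the 70-file listing above
would call for.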
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)