Jinal Shah created CRUNCH-361:
---------------------------------
Summary: Illegal State Exception
Key: CRUNCH-361
URL: https://issues.apache.org/jira/browse/CRUNCH-361
Project: Crunch
Issue Type: Bug
Components: Core
Affects Versions: 0.8.2, 0.9.0
Reporter: Jinal Shah
Assignee: Josh Wills
Priority: Minor
I was trying to use ParallelDoOptions to tell the planner about a SourceTarget
that a parallelDo depends on. If you pass the SourceTarget to it and then do a
union or cogroup on the resulting PCollection in a later step, the planner
tries to find the size of the parent source, which has not been generated yet,
and throws an IllegalStateException. Here are the steps to reproduce it:
{code}
PCollection<U> collection = afterSomeOperation();
// marker could be any SourceTarget implementation
SourceTarget<U> marker = new SourceTarget<U>(pathThatDoesNotExist);
pipeline.write(collection, marker);
PCollection<U> collection2 = pipeline.read(marker);
PCollection<V> collection3 = collection2.parallelDo(
    DoFn, PType, ParallelDoOptions.builder().sources(marker).build());
doSomeMoreOperation();
PCollection<V> union = collection3.union(SomePCollectionOfV);
{code}
This throws the exception because the union cannot find the size of the marker,
which has not been generated yet. The planner should recognize that the Source
does not exist yet and that a job earlier in the pipeline will generate it.
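
As a rough illustration of the behavior I would expect, here is a toy model of
the size lookup. It is only a sketch: the class, the method names, and the
fallback size are all hypothetical, and none of this is Crunch's actual planner
code. The idea is that a missing source is an error only when no pending job in
the pipeline is going to write it.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical model for illustration only; not part of Crunch.
final class SourceSizeSketch {
    // Paths that already exist, mapped to their size in bytes.
    private final Map<String, Long> existingSizes = new HashMap<>();
    // Paths that an earlier job in the same pipeline is scheduled to write.
    private final Set<String> pendingOutputs = new HashSet<>();

    void addExisting(String path, long bytes) {
        existingSizes.put(path, bytes);
    }

    void addPendingOutput(String path) {
        pendingOutputs.add(path);
    }

    // Returns the known size if the path exists, the caller-supplied fallback
    // if a pending job will generate the path, and fails otherwise.
    long estimateSize(String path, long fallbackBytes) {
        Long known = existingSizes.get(path);
        if (known != null) {
            return known;
        }
        if (pendingOutputs.contains(path)) {
            return fallbackBytes;
        }
        throw new IllegalStateException(
            "Source does not exist and no job generates it: " + path);
    }
}
```

With this model, a marker path registered through addPendingOutput gets the
fallback size instead of an exception, while a path that nothing will write
still fails fast.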