Perhaps I don't understand your question correctly. Could you rephrase it? On Thu, Oct 6, 2016 at 3:24 AM, amir bahmanyari <[email protected]> wrote:
> Hi Maximilian, > Slots per TM is = 512. For 4 nodes, that adds up to be 2048 in the cluster. > I set Parallelism(2048) in the code. > Like shown below, all have been consumed and 0 is remained. > Before I start my Beam application, the Available Task Slots = 2048 as > well. > Once started, it goes down to 0 implying all 4 nodes are now consumed and > are processing data. > Its my understanding at least. > Cheers > > > ------------------------------ > *From:* Maximilian Michels <[email protected]> > *To:* [email protected] > *Sent:* Wednesday, October 5, 2016 7:43 AM > *Subject:* Re: Regarding TextIO() > > You have probably configured a parallelism of 1. Try using `-p 2048` to > run with a parallelism of 2048. > > On Wed, Oct 5, 2016 at 7:43 AM, amir bahmanyari <[email protected]> > wrote: > > Thanks JB, > I copied the data file in all nodes. Applied TextIo() in src code. Started > the Flink cluster & deployed the beam fat file to it. Interesting enough, > the dashboard shows that > out of 2048 slots I hav configured for parallelism, only ONE slot is being > used in Flink cluster. > This is not true when i use KafkaIO(). All 2040 get consumed like below. > Have a great eve. > [image: Inline image] > > > > ------------------------------ > *From:* Jean-Baptiste Onofré <[email protected]> > *To:* [email protected] > *Cc:* [email protected] > *Sent:* Tuesday, October 4, 2016 10:29 PM > *Subject:* Re: Regarding TextIO() > > Hi > All depends of the filesystem you are using. If you want all nodes share > the same files then you need a shared filesystem or distributed filesystem > like hdfs (not yet supported by textio) or gs (supported by textio). If you > use file (local filesystem) then the files will be local to each node. > Regards > JB > On Oct 5, 2016, at 02:12, amir bahmanyari <[email protected]> wrote: > > Hi Colleagues, > When you are using TextIO() in a Beam (java) app executing in a Flink > Cluster, do you have to have a copy of the file in every Flink cluster node? > Also, if you want to read the file from one node (probably remote node) > only, how would the TextIO.Read.from("....") look like? > What are the best TextIO() practices in situations like this? > Thanks+regards, > Amir > > > > > > >
