Re: Restricting emit speed of AbstractFileInputOperator.

2017-04-07 Thread Yogi Devendra
Ambarish,

What you are asking for would be useful for others as well.

Would you mind contributing this change to the community?
If you are interested, please create a JIRA at
https://issues.apache.org/jira/browse/APEXMALHAR for your request.

For any dev discussions, please use the dev@apex mailing list.

~ Yogi

On 7 April 2017 at 01:14, Bhupesh Chawda <bhup...@datatorrent.com> wrote:

> You can set emitBatchSize to 1 and make sure emitTuples is called just 'x'
> times within a window. You can do this manually by keeping a count and
> resetting it in beginWindow().
>
> ~ Bhupesh
>
>
> ___
>
> Bhupesh Chawda
>
> E: bhup...@datatorrent.com | Twitter: @bhupeshsc
>
> www.datatorrent.com  |  apex.apache.org
>
>
>
> On Fri, Apr 7, 2017 at 1:38 PM, Ambarish Pande <
> ambarish.pande2...@gmail.com> wrote:
>
>> Yes, I tried. That just gives me control over how many times emitTuples is
>> called. I want control over the number of tuples emitted.
>>
>> Thank you.
>>
>> Sent from my iPhone
>>
>> On 07-Apr-2017, at 8:08 AM, Yogi Devendra <devendra.vyavah...@gmail.com>
>> wrote:
>>
>> Have you tried *emitBatchSize* as mentioned at
>> https://apex.apache.org/docs/malhar/operators/fsInputOperator/
>>
>> ~ Yogi
>>
>> On 3 April 2017 at 00:05, Ambarish Pande <ambarish.pande2...@gmail.com>
>> wrote:
>>
>>> Hi,
>>>
>>> How can I make the AbstractFileInputOperator emit only 'x' number of
>>> lines per window? Is there a hook for that, or do I have to do it manually?
>>>
>>> Thank You.
>>>
>>
>>
>
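The counting approach Bhupesh describes can be sketched in plain Java. This is a hedged, self-contained illustration, not the actual Malhar operator: the class and method names below are made up for illustration. In a real application you would extend AbstractFileInputOperator, set emitBatchSize to 1, and consult a counter like this before each emit in emitTuples().

```java
// Sketch of a per-window emit limit (illustrative only; this class is
// hypothetical and not part of Apache Apex Malhar).
class WindowedEmitLimiter {
    private final int maxTuplesPerWindow;
    private int emittedInWindow;

    WindowedEmitLimiter(int maxTuplesPerWindow) {
        this.maxTuplesPerWindow = maxTuplesPerWindow;
    }

    // Mirrors Operator.beginWindow(): reset the counter at each window start.
    void beginWindow() {
        emittedInWindow = 0;
    }

    // Call before each single-tuple emit (emitBatchSize == 1): returns true
    // if one more tuple may be emitted in the current window.
    boolean tryEmit() {
        if (emittedInWindow >= maxTuplesPerWindow) {
            return false;
        }
        emittedInWindow++;
        return true;
    }
}
```

With the limit set to 'x', at most 'x' calls to tryEmit() succeed per window, regardless of how many times emitTuples() is invoked.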


Re: Setting variables of Application class using properties.xml

2017-04-07 Thread Yogi Devendra
You can access properties defined in properties.xml inside populateDAG(). You
can set variables of Application.java by reading the conf object passed to
populateDAG().
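For example (the property name and value below are hypothetical, chosen only for illustration), you could define a custom key in properties.xml and read it back with conf.get("myapp.maxRecords") inside populateDAG(DAG dag, Configuration conf):

```xml
<!-- Hypothetical custom property; any key works as long as populateDAG()
     reads the same name back via conf.get("myapp.maxRecords"). -->
<property>
  <name>myapp.maxRecords</name>
  <value>1000</value>
</property>
```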

~ Yogi

On 7 April 2017 at 11:02, Ambarish Pande 
wrote:

> Hello,
>
> Is there a way by which I can set some variable declared in my
> Application.java through properties.xml? I know I can set operator
> variables using properties.xml like
> dt.app.operator.prop. I want to do the same thing to set a variable of my
> Application class, which has the populateDAG method.
>
> Thank You.
>


Re: Restricting emit speed of AbstractFileInputOperator.

2017-04-06 Thread Yogi Devendra
Have you tried *emitBatchSize* as mentioned at
https://apex.apache.org/docs/malhar/operators/fsInputOperator/
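Per the fsInputOperator documentation linked above, emitBatchSize is an operator property, so it can be set in properties.xml. The operator name "fileInput" below is a placeholder; substitute the name you gave the operator in your DAG:

```xml
<!-- "fileInput" is a placeholder for your operator's name in the DAG. -->
<property>
  <name>dt.operator.fileInput.prop.emitBatchSize</name>
  <value>100</value>
</property>
```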

~ Yogi

On 3 April 2017 at 00:05, Ambarish Pande 
wrote:

> Hi,
>
> How can I make the AbstractFileInputOperator emit only 'x' number of lines
> per window? Is there a hook for that, or do I have to do it manually?
>
> Thank You.
>


Re: DB vs HDFS for DataTorrent

2017-01-25 Thread Yogi Devendra
Chiranjeevi,


   - HDFS works as a distributed system. Thus, reads can be served from
   different nodes at the source.
   - Not all databases are distributed. If your database server is not
   distributed, then you might face issues with parallel reads beyond a
   certain number of partitions (say 4-5 partitions).
   - Ready-to-use applications for these use cases are available on
   https://www.datatorrent.com/apphub/
   - Source code for these apps is Apache licensed, under:
   https://github.com/datatorrent/app-templates
   - I would suggest doing some sample tests for the workloads you are
   targeting and taking the decision based on those. Kindly share your
   results for the benefit of the community.

~ Yogi

On 25 January 2017 at 14:17, chiranjeevi vasupilli 
wrote:

> Hi Team,
>
> Can you please provide pointers on using a database vs. HDFS as the
> source data for the DataTorrent tool?
>
> Currently we are using HDFS to read the source data and would like to
> know the pros/cons if we switch to a database as the source system.
>
> Please suggest.
> --
> ur's
> chiru
>


Re: monitoring STRAM events for killed containers

2016-11-25 Thread Yogi Devendra
Have you tried http://docs.datatorrent.com/dtmanage/ ?

~ Yogi

On Fri, Nov 25, 2016 at 11:23 AM, chiranjeevi vasupilli  wrote:

> Hi Team,
>
> I would like to monitor my running application so as to find any
> containers getting killed while the job is running.
>
> Can you please suggest the best approach and reference docs?
>
> Thanks
> Chiranjeevi V
>


Re: Which is the main class

2016-11-16 Thread Yogi Devendra
Dimple,

To run the applications locally, please refer to README files from the
examples:
e.g.
https://github.com/DataTorrent/examples/tree/master/tutorials/filter

ApplicationTest.java describes how to run it in local mode.

If you want to launch it on a local/single-node Hadoop cluster, you can use
the apex cli from
https://github.com/apache/apex-core/tree/master/engine/src/main/scripts

~ Yogi

On 17 November 2016 at 02:34, Dimple Patel  wrote:

> Hello,
>
> To run the application jar locally, which class should be used as the main
> class?
>
> Thanks,
> Dimple
>
>
>
> --
> View this message in context: http://apache-apex-users-list.78494.x6.nabble.com/Which-is-the-main-class-tp1131.html
> Sent from the Apache Apex Users list mailing list archive at Nabble.com.
>


Re: Zero byte file For FileSplitter

2016-10-03 Thread Yogi Devendra
FileSplitter would create FileMetaData for a zero-length file.
But there will not be any BlockMetaData for a zero-length file.

On the receiving side, you can add an if condition based
on fileMetadata.getNumberOfBlocks().

For example, you can refer to this:
https://github.com/apache/apex-malhar/blob/763d14fca6b84fdda1b6853235e5d4b71ca87fca/library/src/main/java/com/datatorrent/lib/io/fs/Synchronizer.java#L127
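The receiving-side check can be sketched as follows. The FileMetadata class below is a minimal stand-in for Malhar's AbstractFileSplitter.FileMetadata, reduced to the one getter used here so the sketch is self-contained; in a real operator you would use the actual class.

```java
// Minimal stand-in for AbstractFileSplitter.FileMetadata (illustration only).
class FileMetadata {
    private final int numberOfBlocks;

    FileMetadata(int numberOfBlocks) {
        this.numberOfBlocks = numberOfBlocks;
    }

    int getNumberOfBlocks() {
        return numberOfBlocks;
    }
}

class ZeroByteFileCheck {
    // A zero-length file yields FileMetaData but no BlockMetaData,
    // so its block count is 0.
    static boolean isZeroByteFile(FileMetadata fileMetadata) {
        return fileMetadata.getNumberOfBlocks() == 0;
    }
}
```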

~ Yogi

On 3 October 2016 at 12:46, chiranjeevi vasupilli 
wrote:

> Hi team,
>
> I would like to know how FileSplitter will handle reading zero-byte files.
>
>
> --
> ur's
> chiru
>


Re: Configuring StatelessThroughputBasedPartitioner through xml

2016-08-05 Thread Yogi Devendra
Thanks Shubham.
But it looks like there is no way to achieve it with zero Java code (pure
XML).

~ Yogi

On 5 August 2016 at 12:41, Shubham Pathak <shub...@datatorrent.com> wrote:

> Hello Yogi,
>
> Please refer to the mobile demo:
> https://github.com/apache/apex-malhar/blob/master/demos/mobile/src/main/java/com/datatorrent/demos/mobile/Application.java
>
> Thanks,
> Shubham
>
> On Fri, Aug 5, 2016 at 12:38 PM, Yogi Devendra <
> devendra.vyavah...@gmail.com> wrote:
>
>> Hi,
>>
>> I am using StatelessThroughputBasedPartitioner in my application.
>> I want to configure maximumEvents, minimumEvents for this through xml.
>>
>> How to configure these?
>>
>> ~ Yogi
>>
>
>


Configuring StatelessThroughputBasedPartitioner through xml

2016-08-05 Thread Yogi Devendra
Hi,

I am using StatelessThroughputBasedPartitioner in my application.
I want to configure maximumEvents, minimumEvents for this through xml.

How to configure these?

~ Yogi


Sub Partitioning the parallel partitions

2016-07-25 Thread Yogi Devendra
Hi,

I have a DAG A->B->C.

1. A is a Kafka input operator reading from 4 different topics, configured
with ONE_TO_ONE strategy, thus creating 4 partitions of A.
2. B and C are configured to have parallel partitions w.r.t. their input
ports. Thus, currently both B and C have 4 partitions.

I am observing significant latency in operator B. Thus, I would like to
have 2 partitions of B per partition of A. Since the application is latency
sensitive, I want to avoid unifiers as far as possible.

How to achieve this partitioning?


~ Yogi


Re: hdfs output file operator

2016-07-23 Thread Yogi Devendra
@Rahul
Send an email to dev-unsubscr...@apex.apache.org
from your registered email.

~ Yogi

On 23 July 2016 at 21:03, Rahul More  wrote:

> Please stop the mails
> On Jul 23, 2016 10:41 AM, "Chinmay Kolhatkar" 
> wrote:
>
>> Hi Raja,
>>
>> I can see such a log message in AbstractFileOutputOperator at line 455.
>>
>> As this code is called from the setup of the operator, the operator is
>> getting deployed and then failing while restoring the existing file because
>> of a mismatch between the length of the file and the offset the operator
>> stored previously.
>>
>> From the code it looks like it takes care of such cases and restores the
>> file.
>>
>> From what I understand, either the file got changed in some other way or
>> the offset management has a problem.
>>
>> Are you restarting the application from previous application Id?
>>
>> To narrow down the problem, can you please try to change the destination
>> path and see if that works?
>>
>> Thanks,
>> Chinmay.
>>
>>
>>
>> On Sat, Jul 23, 2016 at 5:00 AM, Sandesh Hegde 
>> wrote:
>>
>>> Please check,
>>>  1. AppMaster logs
>>>  2. Cluster resources
>>>
>>> On Fri, Jul 22, 2016 at 1:14 PM Raja.Aravapalli <
>>> raja.aravapa...@target.com> wrote:
>>>

 Hi,

 I have a File output operator which writes to HDFS files.

 The application has been trying to deploy this operator in many different
 containers for a long time, but is not succeeding. Status is showing as
 PENDING_DEPLOY.

 In the logs of the container where the application is trying to deploy the
 HDFS write operator, I can only see "path corrupted".


 Can someone please guide or suggest me on this ?



 Regards,
 Raja.

>>>
>>


CsvFormatter schema for enum property

2016-07-18 Thread Yogi Devendra
Hi,

We have a Java object which has an enum field. We wish to send these objects
to CsvFormatter.
What should I specify as the "type" of this field?

For example,

My Java Object is :
https://github.com/yogidevendra/examples/blob/13cd0ec8d24b5e5b125fe7567c18197b0a9a8f7e/tutorials/filter/src/main/java/com/datatorrent/tutorial/filter/TransactionPOJO.java

I am using the following schema:

{
  "separator": "|",
  "quoteChar": "\"",
  "lineDelimiter": "\n",
  "fields": [
    {"name": "trasactionId", "type": "long"},
    {"name": "amount", "type": "double"},
    {"name": "type",
     "type": "com.datatorrent.tutorial.filter.TransactionPOJO.TRANSACTION_TYPE"}
  ]
}

But, it gives following exception:

Abandoning deployment due to setup failure.
java.lang.IllegalArgumentException: No enum constant
com.datatorrent.contrib.parser.DelimitedSchema.FieldType.COM.DATATORRENT.TUTORIAL.FILTER.TRANSACTIONPOJO.TRANSACTION_TYPE
at java.lang.Enum.valueOf(Enum.java:236)
at
com.datatorrent.contrib.parser.DelimitedSchema$FieldType.valueOf(DelimitedSchema.java:161)
at
com.datatorrent.contrib.parser.DelimitedSchema$Field.<init>(DelimitedSchema.java:289)
at
com.datatorrent.contrib.parser.DelimitedSchema.initialize(DelimitedSchema.java:199)
at
com.datatorrent.contrib.parser.DelimitedSchema.<init>(DelimitedSchema.java:169)
at
com.datatorrent.contrib.formatter.CsvFormatter.setup(CsvFormatter.java:128)
at
com.datatorrent.contrib.formatter.CsvFormatter.setup(CsvFormatter.java:67)
at com.datatorrent.stram.engine.Node.setup(Node.java:187)
at
com.datatorrent.stram.engine.StreamingContainer.setupNode(StreamingContainer.java:1309)
at
com.datatorrent.stram.engine.StreamingContainer.access$100(StreamingContainer.java:130)
at
com.datatorrent.stram.engine.StreamingContainer$2.run(StreamingContainer.java:1388)
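Judging from the stack trace, DelimitedSchema.FieldType.valueOf() only accepts that enum's own constants, so a fully qualified enum class name cannot be used as a "type". One possible workaround (an assumption, not a confirmed fix) is to declare the field as a String in the schema and convert it to the enum inside the POJO's setter:

```json
{"separator": "|", "quoteChar": "\"", "lineDelimiter": "\n", "fields": [
  {"name": "trasactionId", "type": "long"},
  {"name": "amount", "type": "double"},
  {"name": "type", "type": "String"}
]}
```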

~ Yogi


Re: Regarding using Scala to develop Apex app.

2016-07-15 Thread Yogi Devendra
Akshay, Ankit

Yes, Apex intends to have full support for Scala in the future. We understand
that many developers would love to use Scala or Python bindings. We will
definitely consider this as user feedback, and it will be funneled through
the items listed for the project roadmap.

However, it would be hard to give a specific timeline for this. Priorities
are decided based on the importance of a feature as well as the number of
people asking for it.

On a side note, you can also contribute to this effort.


~ Yogi

On 15 July 2016 at 11:44, Ankit Sarraf  wrote:

> Interesting. Thanks for sharing the link. I surely feel that if DataTorrent
> has to expand its roots, it will have to evolve to support as many
> languages as seamlessly as possible. So it seems it will have complete
> support for Scala in the future.
>
> Also, curious to know whether people at DT are thinking of allowing SBT
> application builds with the apps?
>
> On Jul 14, 2016 11:07 PM, "Akshay S Harale" 
> wrote:
>
> Hello,
>
> I found one blog post on writing an Apex app in Scala.
> First I tried a simple app and it worked very well, but when I introduced
> some anonymous functions in the program, it started throwing a Kryo
> serialization exception:
> *com.esotericsoftware.kryo.KryoException: Class cannot be created (missing
> no-arg constructor): com.sample.Aggregator$$anon$1*
>
> My question is : Will Apex have full support for scala in future ?
>
> Regards,
> Akshay S. Harale
> Software Developer @ Synerzip
> Skype – akshayharale
>
> This e-mail, including any attached files, may contain confidential and
> privileged information for the sole use of the intended recipient. Any
> review, use, distribution, or disclosure by others is strictly prohibited.
> If you are not the intended recipient (or authorized to receive information
> for the intended recipient), please contact the sender by reply e-mail and
> delete all copies of this message.
>
>
>