[jira] [Created] (FLINK-1959) Accumulators BROKEN after Partitioning

2015-04-29 Thread mustafa elbehery (JIRA)
mustafa elbehery created FLINK-1959:
---

 Summary: Accumulators BROKEN after Partitioning
 Key: FLINK-1959
 URL: https://issues.apache.org/jira/browse/FLINK-1959
 Project: Flink
  Issue Type: Bug
  Components: Examples
Affects Versions: 0.8.1
Reporter: mustafa elbehery
Priority: Critical
 Fix For: 0.8.1


While running the Accumulator example in 
https://github.com/Elbehery/flink/blob/master/flink-examples/flink-java-examples/src/main/java/org/apache/flink/examples/java/relational/EmptyFieldsCountAccumulator.java,
I altered the data flow by calling "partitionByHash" before applying 
"filter", and the resulting accumulator was NULL. 

While debugging, I could see the accumulator in the runtime map. However, when 
retrieving the accumulator from the JobExecutionResult object, it was NULL. 

The line that causes the problem is "file.partitionByHash(1).filter(new 
EmptyFieldFilter())", used instead of "file.filter(new EmptyFieldFilter())".
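
For reference, the expected behaviour can be modelled without Flink. The
sketch below is plain Java and does not use the Flink API: it hash-partitions
rows on field 1, lets each partition count empty fields in its own local
counter, and merges the local counters at the end. Under that model the merged
counts should equal the counts taken without partitioning, which is what one
would expect the accumulator to hold in both variants of the job. The class
name, the sample rows, and the number of partitions are assumptions made for
illustration only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class EmptyFieldCountSketch {

    /** Counts, per column, how many rows have an empty value in that column. */
    public static int[] countEmptyFields(List<String[]> rows, int numFields) {
        int[] counts = new int[numFields];
        for (String[] row : rows) {
            for (int i = 0; i < numFields; i++) {
                if (row[i] == null || row[i].trim().isEmpty()) {
                    counts[i]++;
                }
            }
        }
        return counts;
    }

    /** Plain-Java stand-in for hash partitioning on one field (not Flink code). */
    public static List<List<String[]>> partitionByHash(
            List<String[]> rows, int field, int parallelism) {
        List<List<String[]>> partitions = new ArrayList<>();
        for (int p = 0; p < parallelism; p++) {
            partitions.add(new ArrayList<>());
        }
        for (String[] row : rows) {
            int hash = row[field] == null ? 0 : row[field].hashCode();
            partitions.get(Math.floorMod(hash, parallelism)).add(row);
        }
        return partitions;
    }

    /** Counts inside each partition, then merges the partial counts element-wise. */
    public static int[] countAfterPartitioning(
            List<String[]> rows, int field, int numFields, int parallelism) {
        int[] merged = new int[numFields];
        for (List<String[]> partition : partitionByHash(rows, field, parallelism)) {
            int[] local = countEmptyFields(partition, numFields);
            for (int i = 0; i < numFields; i++) {
                merged[i] += local[i];
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        // Hypothetical sample data: three columns, some values left empty.
        List<String[]> rows = Arrays.asList(
                new String[] {"a", "x", "1"},
                new String[] {"b", "", "2"},
                new String[] {"c", "y", ""},
                new String[] {"d", "", "4"});
        // Both lines are expected to print the same per-column counts.
        System.out.println(Arrays.toString(countEmptyFields(rows, 3)));
        System.out.println(Arrays.toString(countAfterPartitioning(rows, 1, 3, 4)));
    }
}
```

A NULL result after partitioning therefore points at how the accumulator is
reported back, not at the counting logic itself, which matches the observation
that the accumulator is visible at runtime.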



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Tweets Custom Input Format

2015-02-27 Thread Mustafa Elbehery
@Robert,

I have created the PR: https://github.com/apache/flink/pull/442



On Fri, Feb 27, 2015 at 11:58 AM, Mustafa Elbehery <
elbeherymust...@gmail.com> wrote:

> @Robert,
>
> Thanks I was asking about the procedure. I have opened a Jira ticket for
> Flink-Contrib and I will create a PR with the naming convention on Wiki,
>
> https://issues.apache.org/jira/browse/FLINK-1615,
>
>
>
> On Fri, Feb 27, 2015 at 11:55 AM, Robert Metzger 
> wrote:
>
>> I'm glad you've found the how to contribute guide.
>>
>> I can not describe the process to open a pull request better than already
>> written in the guide.
>> Maybe this link is also helpful for you:
>> https://help.github.com/articles/creating-a-pull-request/
>>
>> Are you facing a particular error message? Maybe that helps me to help you
>> better.
>>
>>
>> On Fri, Feb 27, 2015 at 10:46 AM, Mustafa Elbehery <
>> elbeherymust...@gmail.com> wrote:
>>
>> > Actually I am reading "How to contribute" now to push the code. Its
>> > working and tested locally and on the cluster, and i have used it for
>> > an ETL.
>> >
>> > The structure as follow :-
>> >
>> > Java Pojos for the tweet object, and the nested objects.  Parser class
>> > using event-driven approach, and the SimpleTweetInputFormat itself.
>> >
>> > Would you guide me how to push the code, just to save sometime :)
>> >
>> >
>> > On Fri, Feb 27, 2015 at 10:42 AM, Robert Metzger 
>> > wrote:
>> >
>> > > Hi,
>> > >
>> > > cool! Can you generalize the input format to read JSON into an
>> > > arbitrary POJO?
>> > >
>> > > It would be great if you could contribute the InputFormat into the
>> > > "flink-contrib" module. I've seen many users reading JSON data with
>> > > Flink, so its good to have a standard solution for that.
>> > > If you want you can add the "Tweet into POJO" as an example into
>> > > flink-contrib.
>> > >
>> > > On Fri, Feb 27, 2015 at 10:37 AM, Mustafa Elbehery <
>> > > elbeherymust...@gmail.com> wrote:
>> > >
>> > > > Hi,
>> > > >
>> > > > I am really sorry for being so late, it was a whole month of
>> > > > projects and examination, I was really busy.
>> > > >
>> > > > @Robert, it is IF for reading tweet into Pojo. I use an event-driven
>> > > > parser, I retrieve most of the tweet into Java Pojos, it was tested
>> > > > on 1TB dataset, for a Flink ETL job, and the performance was pretty
>> > > > good.
>> > > >
>> > > >
>> > > >
>> > > > On Sun, Jan 25, 2015 at 7:38 PM, Robert Metzger <rmetz...@apache.org>
>> > > > wrote:
>> > > >
>> > > > > Hey,
>> > > > >
>> > > > > is it a input format for reading JSON data or an IF for reading
>> > > > > tweets in some format into a pojo?
>> > > > >
>> > > > > I think a JSON Input Format would be something very useful for our
>> > > > > users.
>> > > > > Maybe you can add that and use the Tweet IF as a concrete example
>> > > > > for that?
>> > > > > Do you have a preview of the code somewhere?
>> > > > >
>> > > > > Best,
>> > > > > Robert
>> > > > >
>> > > > > On Sat, Jan 24, 2015 at 11:06 AM, Fabian Hueske <fhue...@gmail.com>
>> > > > > wrote:
>> > > > >
>> > > > > > Hi Mustafa,
>> > > > > >
>> > > > > > that would be a nice contribution!
>> > > > > >
>> > > > > > We are currently discussing how to add "non-core" API features
>> > > > > > into Flink [1].
>> > > > > > I will move this discussion onto the mailing list to decide
>> > > > > > where to add cool add-ons like yours.
>> > > > > >
>> > > > > > Cheers, Fabian
>> > > > > >
>> > > > > > [1] https://issues.apache.org/jira/browse/FLINK-1398
>> > > >

Re: Tweets Custom Input Format

2015-02-27 Thread Mustafa Elbehery
@Robert,

Thanks, I was asking about the procedure. I have opened a Jira ticket for
flink-contrib, and I will create a PR following the naming convention on the
wiki:

https://issues.apache.org/jira/browse/FLINK-1615



On Fri, Feb 27, 2015 at 11:55 AM, Robert Metzger 
wrote:

> I'm glad you've found the how to contribute guide.
>
> I can not describe the process to open a pull request better than already
> written in the guide.
> Maybe this link is also helpful for you:
> https://help.github.com/articles/creating-a-pull-request/
>
> Are you facing a particular error message? Maybe that helps me to help you
> better.
>
>
> On Fri, Feb 27, 2015 at 10:46 AM, Mustafa Elbehery <
> elbeherymust...@gmail.com> wrote:
>
> > Actually I am reading "How to contribute" now to push the code. Its
> > working and tested locally and on the cluster, and i have used it for
> > an ETL.
> >
> > The structure as follow :-
> >
> > Java Pojos for the tweet object, and the nested objects.  Parser class
> > using event-driven approach, and the SimpleTweetInputFormat itself.
> >
> > Would you guide me how to push the code, just to save sometime :)
> >
> >
> > On Fri, Feb 27, 2015 at 10:42 AM, Robert Metzger 
> > wrote:
> >
> > > Hi,
> > >
> > > cool! Can you generalize the input format to read JSON into an
> > > arbitrary POJO?
> > >
> > > It would be great if you could contribute the InputFormat into the
> > > "flink-contrib" module. I've seen many users reading JSON data with
> > > Flink, so its good to have a standard solution for that.
> > > If you want you can add the "Tweet into POJO" as an example into
> > > flink-contrib.
> > >
> > > On Fri, Feb 27, 2015 at 10:37 AM, Mustafa Elbehery <
> > > elbeherymust...@gmail.com> wrote:
> > >
> > > > Hi,
> > > >
> > > > I am really sorry for being so late, it was a whole month of projects
> > > > and examination, I was really busy.
> > > >
> > > > @Robert, it is IF for reading tweet into Pojo. I use an event-driven
> > > > parser, I retrieve most of the tweet into Java Pojos, it was tested
> > > > on 1TB dataset, for a Flink ETL job, and the performance was pretty
> > > > good.
> > > >
> > > >
> > > >
> > > > On Sun, Jan 25, 2015 at 7:38 PM, Robert Metzger
> > > > wrote:
> > > >
> > > > > Hey,
> > > > >
> > > > > is it a input format for reading JSON data or an IF for reading
> > > > > tweets in some format into a pojo?
> > > > >
> > > > > I think a JSON Input Format would be something very useful for our
> > > > > users.
> > > > > Maybe you can add that and use the Tweet IF as a concrete example
> > > > > for that?
> > > > > Do you have a preview of the code somewhere?
> > > > >
> > > > > Best,
> > > > > Robert
> > > > >
> > > > > On Sat, Jan 24, 2015 at 11:06 AM, Fabian Hueske
> > > > > wrote:
> > > > >
> > > > > > Hi Mustafa,
> > > > > >
> > > > > > that would be a nice contribution!
> > > > > >
> > > > > > We are currently discussing how to add "non-core" API features
> > > > > > into Flink [1].
> > > > > > I will move this discussion onto the mailing list to decide where
> > > > > > to add cool add-ons like yours.
> > > > > >
> > > > > > Cheers, Fabian
> > > > > >
> > > > > > [1] https://issues.apache.org/jira/browse/FLINK-1398
> > > > > >
> > > > > > 2015-01-23 20:42 GMT+01:00 Henry Saputra <henry.sapu...@gmail.com>:
> > > > > >
> > > > > > > Contributions are welcomed!
> > > > > > >
> > > > > > > Here is the link on how to contribute to Apache Flink:
> > > > > > > http://flink.apache.org/how-to-contribute.html
> > > > > > >
> > > > > > > You can start by creating JIRA ticket [1] to help describe what
> > > > > > > you wanted to do and to get feedback from community.
> >

[jira] [Created] (FLINK-1615) Introduces a new InputFormat for Tweets

2015-02-27 Thread mustafa elbehery (JIRA)
mustafa elbehery created FLINK-1615:
---

 Summary: Introduces a new InputFormat for Tweets
 Key: FLINK-1615
 URL: https://issues.apache.org/jira/browse/FLINK-1615
 Project: Flink
  Issue Type: New Feature
  Components: flink-contrib
Affects Versions: 0.8.1
Reporter: mustafa elbehery
Priority: Minor


An event-driven parser that reads Tweets into Java POJOs. 

It parses all the important parts of a tweet into Java objects. 

Tested on a cluster, and the performance is pretty good. 
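
To illustrate what "event-driven" means here, the following is a minimal
sketch in plain Java. It is not the contributed code and does not use the
JSON-Simple library mentioned elsewhere in the thread: the JsonEventHandler
interface is a hypothetical stand-in for the callbacks a streaming JSON parser
would fire, and the Tweet and User classes cover only a small, assumed subset
of fields (id, text, user.screen_name). The handler tracks which object it is
inside and copies values into the POJO as events arrive, so the whole document
never has to be held as a tree.

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class TweetEventSketch {

    /** Hypothetical callbacks; a real streaming parser would invoke these. */
    public interface JsonEventHandler {
        void startObject();
        void endObject();
        void key(String name);
        void value(Object primitive);
    }

    /** Nested POJO for the tweet author (assumed subset of fields). */
    public static class User {
        public String screenName;
    }

    /** POJO for the tweet itself (assumed subset of fields). */
    public static class Tweet {
        public long id;
        public String text;
        public User user = new User();
    }

    /** Fills a Tweet from parser events while tracking the enclosing object. */
    public static class TweetHandler implements JsonEventHandler {
        public final Tweet tweet = new Tweet();
        // Key under which each currently open object was started ("" = root).
        private final Deque<String> path = new ArrayDeque<>();
        private String currentKey;

        @Override
        public void startObject() {
            path.push(currentKey == null ? "" : currentKey);
            currentKey = null;
        }

        @Override
        public void endObject() {
            path.pop();
            currentKey = null;
        }

        @Override
        public void key(String name) {
            currentKey = name;
        }

        @Override
        public void value(Object primitive) {
            String parent = path.peek();
            if ("".equals(parent)) {
                if ("id".equals(currentKey)) {
                    tweet.id = ((Number) primitive).longValue();
                } else if ("text".equals(currentKey)) {
                    tweet.text = (String) primitive;
                }
            } else if ("user".equals(parent) && "screen_name".equals(currentKey)) {
                tweet.user.screenName = (String) primitive;
            }
            currentKey = null;
        }
    }

    public static void main(String[] args) {
        // Events for {"id": 42, "user": {"screen_name": "flink"}, "text": "hello"},
        // emitted by hand here in place of a real parser.
        TweetHandler handler = new TweetHandler();
        handler.startObject();
        handler.key("id");
        handler.value(42L);
        handler.key("user");
        handler.startObject();
        handler.key("screen_name");
        handler.value("flink");
        handler.endObject();
        handler.key("text");
        handler.value("hello");
        handler.endObject();
        System.out.println(handler.tweet.id + " " + handler.tweet.user.screenName
                + " " + handler.tweet.text);
    }
}
```

In an InputFormat, a handler of this kind would presumably be reset and reused
for every input record, which is one reason the event-driven style tends to
suit large datasets.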





Re: Tweets Custom Input Format

2015-02-27 Thread Mustafa Elbehery
Actually, I am reading "How to contribute" now so that I can push the code. It
is working and tested locally and on the cluster, and I have used it for an
ETL job.

The structure is as follows:

Java POJOs for the tweet object and its nested objects, a parser class
using an event-driven approach, and the SimpleTweetInputFormat itself.

Could you guide me on how to push the code, just to save some time? :)


On Fri, Feb 27, 2015 at 10:42 AM, Robert Metzger 
wrote:

> Hi,
>
> cool! Can you generalize the input format to read JSON into an arbitrary
> POJO?
>
> It would be great if you could contribute the InputFormat into the
> "flink-contrib" module. I've seen many users reading JSON data with Flink,
> so its good to have a standard solution for that.
> If you want you can add the "Tweet into POJO" as an example into
> flink-contrib.
>
> On Fri, Feb 27, 2015 at 10:37 AM, Mustafa Elbehery <
> elbeherymust...@gmail.com> wrote:
>
> > Hi,
> >
> > I am really sorry for being so late, it was a whole month of projects and
> > examination, I was really busy.
> >
> > @Robert, it is IF for reading tweet into Pojo. I use an event-driven
> > parser, I retrieve most of the tweet into Java Pojos, it was tested on
> > 1TB dataset, for a Flink ETL job, and the performance was pretty good.
> >
> >
> >
> > On Sun, Jan 25, 2015 at 7:38 PM, Robert Metzger 
> > wrote:
> >
> > > Hey,
> > >
> > > is it a input format for reading JSON data or an IF for reading tweets
> > > in some format into a pojo?
> > >
> > > I think a JSON Input Format would be something very useful for our
> > > users.
> > > Maybe you can add that and use the Tweet IF as a concrete example for
> > > that?
> > > Do you have a preview of the code somewhere?
> > >
> > > Best,
> > > Robert
> > >
> > > On Sat, Jan 24, 2015 at 11:06 AM, Fabian Hueske
> > > wrote:
> > >
> > > > Hi Mustafa,
> > > >
> > > > that would be a nice contribution!
> > > >
> > > > We are currently discussing how to add "non-core" API features into
> > > > Flink [1].
> > > > I will move this discussion onto the mailing list to decide where to
> > > > add cool add-ons like yours.
> > > >
> > > > Cheers, Fabian
> > > >
> > > > [1] https://issues.apache.org/jira/browse/FLINK-1398
> > > >
> > > > 2015-01-23 20:42 GMT+01:00 Henry Saputra:
> > > >
> > > > > Contributions are welcomed!
> > > > >
> > > > > Here is the link on how to contribute to Apache Flink:
> > > > > http://flink.apache.org/how-to-contribute.html
> > > > >
> > > > > You can start by creating JIRA ticket [1] to help describe what you
> > > > > wanted to do and to get feedback from community.
> > > > >
> > > > >
> > > > > - Henry
> > > > >
> > > > > [1] https://issues.apache.org/jira/secure/Dashboard.jspa
> > > > >
> > > > > On Fri, Jan 23, 2015 at 10:54 AM, Mustafa Elbehery
> > > > >  wrote:
> > > > > > Hi,
> > > > > >
> > > > > > I have created a custom InputFormat for tweets on Flink, based on
> > > > > > JSON-Simple event driven parser. I would like to contribute my
> > > > > > work into Flink,
> > > > > >
> > > > > > Regards.
> > > > > >
> > > > > > --
> > > > > > Mustafa Elbehery
> > > > > > EIT ICT Labs Master School <http://www.masterschool.eitictlabs.eu/home/>
> > > > > > +49(0)15218676094
> > > > > > skype: mustafaelbehery87
> > > > >
> > > >
> > >
> >
> >
> >
> >
>



-- 
Mustafa Elbehery
EIT ICT Labs Master School <http://www.masterschool.eitictlabs.eu/home/>
+49(0)15218676094
skype: mustafaelbehery87


Re: Tweets Custom Input Format

2015-02-27 Thread Mustafa Elbehery
Hi,

I am really sorry for being so late. It was a whole month of projects and
examinations, and I was really busy.

@Robert, it is an IF for reading tweets into POJOs. I use an event-driven
parser and retrieve most of the tweet into Java POJOs. It was tested on a 1TB
dataset for a Flink ETL job, and the performance was pretty good.



On Sun, Jan 25, 2015 at 7:38 PM, Robert Metzger wrote:

> Hey,
>
> is it a input format for reading JSON data or an IF for reading tweets in
> some format into a pojo?
>
> I think a JSON Input Format would be something very useful for our users.
> Maybe you can add that and use the Tweet IF as a concrete example for that?
> Do you have a preview of the code somewhere?
>
> Best,
> Robert
>
> On Sat, Jan 24, 2015 at 11:06 AM, Fabian Hueske wrote:
>
> > Hi Mustafa,
> >
> > that would be a nice contribution!
> >
> > We are currently discussing how to add "non-core" API features into Flink
> > [1].
> > I will move this discussion onto the mailing list to decide where to add
> > cool add-ons like yours.
> >
> > Cheers, Fabian
> >
> > [1] https://issues.apache.org/jira/browse/FLINK-1398
> >
> > 2015-01-23 20:42 GMT+01:00 Henry Saputra:
> >
> > > Contributions are welcomed!
> > >
> > > Here is the link on how to contribute to Apache Flink:
> > > http://flink.apache.org/how-to-contribute.html
> > >
> > > You can start by creating JIRA ticket [1] to help describe what you
> > > wanted to do and to get feedback from community.
> > >
> > >
> > > - Henry
> > >
> > > [1] https://issues.apache.org/jira/secure/Dashboard.jspa
> > >
> > > On Fri, Jan 23, 2015 at 10:54 AM, Mustafa Elbehery
> > >  wrote:
> > > > Hi,
> > > >
> > > > I have created a custom InputFormat for tweets on Flink, based on
> > > > JSON-Simple event driven parser. I would like to contribute my work
> > > > into Flink,
> > > >
> > > > Regards.
> > > >
> > > > --
> > > > Mustafa Elbehery
> > > > EIT ICT Labs Master School <http://www.masterschool.eitictlabs.eu/home/>
> > > > +49(0)15218676094
> > > > skype: mustafaelbehery87
> > >
> >
>





Tweets Custom Input Format

2015-01-23 Thread Mustafa Elbehery
Hi,

I have created a custom InputFormat for tweets on Flink, based on the
JSON-Simple event-driven parser. I would like to contribute my work to
Flink.

Regards.
