Re: On (FLINK-1526) JIRA issue

2016-09-22 Thread Vasiliki Kalavri
Exactly :) That's why we haven't added either the spanning tree or the
strongly connected components algorithms yet.

On Sep 22, 2016 12:16 PM, "Stephan Ewen" wrote:

> Just as a general comment:
>
> A program with nested loops is most likely not going to be performant in
> any way. It makes sense to re-think the algorithm, come up with a modified
> or different pattern, rather than trying to implement the exact algorithm
> line by line.
>
> It may be worth checking that, because I am not sure if Gelly should have
> algorithms that don't perform well.


Re: On (FLINK-1526) JIRA issue

2016-09-22 Thread Stephan Ewen
Just as a general comment:

A program with nested loops is most likely not going to be performant in
any way. It makes sense to re-think the algorithm, come up with a modified
or different pattern, rather than trying to implement the exact algorithm
line by line.

It may be worth checking that, because I am not sure if Gelly should have
algorithms that don't perform well.

On Thu, Sep 22, 2016 at 11:40 AM, Vasiliki Kalavri <
vasilikikala...@gmail.com> wrote:

> Hi Olga,
>
> when you use mapEdges() or mapVertices() with generics, Flink cannot
> determine the type because of type erasure, like the exception says. That's
> why we also provide methods that take the type information as a parameter.
> You can use those to make the return type explicit. In your example, you
> should do something like the following (line 41):
>
> final TypeInformation<Long> longType = BasicTypeInfo.LONG_TYPE_INFO;
> final TypeInformation<Double> doubleType = BasicTypeInfo.DOUBLE_TYPE_INFO;
> Graph<Long, ?, Tuple3<Double, Long, Long>> graphOut =
>     graph.mapEdges(new InitializeEdges(),
>         new TupleTypeInfo(longType, longType,
>             new TupleTypeInfo<Tuple3<Double, Long, Long>>(doubleType,
>                 longType, longType)));
>
> Regarding the nested loops, I am almost sure that you will face problems if
> you try to experiment with large datasets. I haven't looked into your code
> yet, but according to the JIRA discussion, we've faced this problem before
> and afaik, this is still an issue.
>
> Cheers,
> -Vasia.


Re: On (FLINK-1526) JIRA issue

2016-09-22 Thread Vasiliki Kalavri
Hi Olga,

when you use mapEdges() or mapVertices() with generics, Flink cannot
determine the type because of type erasure, like the exception says. That's
why we also provide methods that take the type information as a parameter.
You can use those to make the return type explicit. In your example, you
should do something like the following (line 41):

final TypeInformation<Long> longType = BasicTypeInfo.LONG_TYPE_INFO;
final TypeInformation<Double> doubleType = BasicTypeInfo.DOUBLE_TYPE_INFO;
Graph<Long, ?, Tuple3<Double, Long, Long>> graphOut =
    graph.mapEdges(new InitializeEdges(),
        new TupleTypeInfo(longType, longType,
            new TupleTypeInfo<Tuple3<Double, Long, Long>>(doubleType,
                longType, longType)));
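
For anyone following along without a Flink checkout, here is a minimal
plain-Java sketch of the erasure problem the exception describes. It does not
use Flink's type extractor. GenericMapper and DoubleMapper are hypothetical
stand-ins for a generic function like InitializeEdges, and reflectedValueType
is an illustrative helper, not a Flink API. It shows that an instance created
as new GenericMapper<Double>() only exposes the type variable EV, while a
direct subclass that binds EV keeps Double visible to reflection.

```java
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;
import java.lang.reflect.TypeVariable;
import java.util.function.Function;

public class ErasureSketch {

    // Hypothetical stand-in for a generic user function such as InitializeEdges.
    public static class GenericMapper<EV> implements Function<EV, EV> {
        @Override
        public EV apply(EV value) {
            return value;
        }
    }

    // Direct subclass that binds EV to Double; the binding is kept in class metadata.
    public static class DoubleMapper extends GenericMapper<Double> {
    }

    // Returns what reflection can learn about EV from the mapper's class alone.
    // Assumption: handles only GenericMapper itself and its direct subclasses.
    public static Type reflectedValueType(GenericMapper<?> mapper) {
        Class<?> cls = mapper.getClass();
        Type declared = (cls == GenericMapper.class)
                ? cls.getGenericInterfaces()[0]   // Function<EV, EV>
                : cls.getGenericSuperclass();     // e.g. GenericMapper<Double>
        return ((ParameterizedType) declared).getActualTypeArguments()[0];
    }

    public static void main(String[] args) {
        // The <Double> written here is erased: only the variable EV remains.
        Type erased = reflectedValueType(new GenericMapper<Double>());
        // The subclass records the binding, so Double can be recovered.
        Type bound = reflectedValueType(new DoubleMapper());
        System.out.println("generic instance is a type variable: "
                + (erased instanceof TypeVariable));
        System.out.println("subclass binds EV to Double: "
                + (bound == Double.class));
    }
}
```

Passing the type information explicitly to mapEdges(), as in the snippet
above, is the alternative to writing such a subclass: the caller states the
type that reflection cannot see.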

Regarding the nested loops, I am almost sure that you will face problems if
you try to experiment with large datasets. I haven't looked into your code
yet, but according to the JIRA discussion, we've faced this problem before
and afaik, this is still an issue.

Cheers,
-Vasia.

On 22 September 2016 at 01:12, Olga Golovneva wrote:

> Hi Vasia,
>
> I have uploaded these tests to GitHub:
> https://github.com/OlgaGolovneva/MST/tree/master/tests
>
> I have also uploaded the source code, but I'm still working on it:
> https://github.com/OlgaGolovneva/MST/tree/master/src
>
> >I think you cannot add attachments to the mailing list. Could you upload
> >your example somewhere and post a link here? I'm actually surprised that
> >the while-loop works without problems.
>
> I have run the program on several simple tests, and I was going to try
> large datasets in the next few days. Please let me know if this approach
> is wrong.
>
> Thanks,
> Olga


Re: On (FLINK-1526) JIRA issue

2016-09-21 Thread Olga Golovneva
Hi Vasia,

I have uploaded these tests to GitHub:
https://github.com/OlgaGolovneva/MST/tree/master/tests

I have also uploaded the source code, but I'm still working on it:
https://github.com/OlgaGolovneva/MST/tree/master/src

>I think you cannot add attachments to the mailing list. Could you upload
>your example somewhere and post a link here? I'm actually surprised that
>the while-loop works without problems.

I have run the program on several simple tests, and I was going to try
large datasets in the next few days. Please let me know if this approach
is wrong.

Thanks,
Olga

On Wed, Sep 21, 2016 at 4:55 PM, Vasiliki Kalavri wrote:

> Hi Olga,
>
> On 21 September 2016 at 18:50, Olga Golovneva wrote:
>
> > Hi devs,
> >
> > I was working on the (FLINK-1526) "Add Minimum Spanning Tree library method
> > and example" issue. I've developed (Java) code that implements a distributed
> > version of Boruvka's algorithm in the Gelly library. I've run several tests
> > and it seems to work fine, although I haven't tested it on extremely large
> > input graphs yet, and I'm also trying to optimize my code.
> > In particular, I have two main issues:
> >
> > 1. Nested loops.
> > I have to use nested loops, and I do not see a way to avoid them. As
> > they are currently not supported, I'm using Bulk Iterations inside a
> > "classic" while loop. I've attached a simple example,
> > MyNestedIterationExample, that shows this issue.
> >
>
> I think you cannot add attachments to the mailing list. Could you upload
> your example somewhere and post a link here? I'm actually surprised that
> the while-loop works without problems.
>
>
> >
> > 2. For some reason I cannot create a class that works with types with
> > generic variables in Tuple2 (or Tuple3), so my code does not support
> > generic types. I have also attached a simple example, MyTuple3Example.
> > Here is the exception I get:
> > "Exception in thread "main"
> > org.apache.flink.api.common.functions.InvalidTypesException:
> > Type of TypeVariable 'EV' in 'class
> > org.apache.flink.graph.examples.MyTuple3Example$InitializeEdges' could not
> > be determined. This is most likely a type erasure problem. The type
> > extraction currently supports types with generic variables only in cases
> > where all variables in the return type can be deduced from the input
> > type(s)."
> >
>
> Can you upload this example and link to it too?
>
> Thanks,
> -Vasia.
>
>
> >
> > I would really appreciate it if someone could explain to me how to avoid
> > this exception. Otherwise, I could submit my code for testing.
> >
> > Best regards,
> > Olga Golovneva
> >
>


Re: On (FLINK-1526) JIRA issue

2016-09-21 Thread Vasiliki Kalavri
Hi Olga,

On 21 September 2016 at 18:50, Olga Golovneva wrote:

> Hi devs,
>
> I was working on the (FLINK-1526) "Add Minimum Spanning Tree library method
> and example" issue. I've developed (Java) code that implements a distributed
> version of Boruvka's algorithm in the Gelly library. I've run several tests
> and it seems to work fine, although I haven't tested it on extremely large
> input graphs yet, and I'm also trying to optimize my code.
> In particular, I have two main issues:
>
> 1. Nested loops.
> I have to use nested loops, and I do not see a way to avoid them. As
> they are currently not supported, I'm using Bulk Iterations inside a
> "classic" while loop. I've attached a simple example,
> MyNestedIterationExample, that shows this issue.
>

I think you cannot add attachments to the mailing list. Could you upload
your example somewhere and post a link here? I'm actually surprised that
the while-loop works without problems.


>
> 2. For some reason I cannot create a class that works with types with
> generic variables in Tuple2 (or Tuple3), so my code does not support
> generic types. I have also attached a simple example, MyTuple3Example.
> Here is the exception I get:
> "Exception in thread "main"
> org.apache.flink.api.common.functions.InvalidTypesException:
> Type of TypeVariable 'EV' in 'class
> org.apache.flink.graph.examples.MyTuple3Example$InitializeEdges' could not
> be determined. This is most likely a type erasure problem. The type
> extraction currently supports types with generic variables only in cases
> where all variables in the return type can be deduced from the input
> type(s)."
>

Can you upload this example and link to it too?

Thanks,
-Vasia.


>
> I would really appreciate it if someone could explain to me how to avoid
> this exception. Otherwise, I could submit my code for testing.
>
> Best regards,
> Olga Golovneva
>


On (FLINK-1526) JIRA issue

2016-09-21 Thread Olga Golovneva
Hi devs,

I was working on the (FLINK-1526) "Add Minimum Spanning Tree library method
and example" issue. I've developed (Java) code that implements a distributed
version of Boruvka's algorithm in the Gelly library. I've run several tests
and it seems to work fine, although I haven't tested it on extremely large
input graphs yet, and I'm also trying to optimize my code.
In particular, I have two main issues:

1. Nested loops.
I have to use nested loops, and I do not see a way to avoid them. As they
are currently not supported, I'm using Bulk Iterations inside a "classic"
while loop. I've attached a simple example, MyNestedIterationExample, that
shows this issue.

2. For some reason I cannot create a class that works with types with generic
variables in Tuple2 (or Tuple3), so my code does not support generic
types. I have also attached a simple example, MyTuple3Example. Here is the
exception I get:
"Exception in thread "main"
org.apache.flink.api.common.functions.InvalidTypesException: Type of
TypeVariable 'EV' in 'class
org.apache.flink.graph.examples.MyTuple3Example$InitializeEdges' could not
be determined. This is most likely a type erasure problem. The type
extraction currently supports types with generic variables only in cases
where all variables in the return type can be deduced from the input
type(s)."

I would really appreciate it if someone could explain to me how to avoid
this exception. Otherwise, I could submit my code for testing.

Best regards,
Olga Golovneva
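
For readers unfamiliar with the algorithm, the round structure of Boruvka's
algorithm can be sketched sequentially as below. This is a single-machine
illustration in plain Java with a union-find, not the Gelly implementation
discussed in this thread; the class name, the method names, and the
edge-array representation are assumptions made for the example. Each pass of
the outer loop is one round: every component picks its cheapest outgoing
edge, then the components joined by those edges are merged.

```java
import java.util.Arrays;

public class BoruvkaSketch {

    // Union-find lookup with path halving.
    static int find(int[] parent, int v) {
        while (parent[v] != v) {
            parent[v] = parent[parent[v]];
            v = parent[v];
        }
        return v;
    }

    // Strict order on edges: by weight, ties broken by edge index.
    // A total order guarantees the chosen edges never form a cycle.
    static boolean less(double[] weights, int i, int j) {
        return weights[i] < weights[j] || (weights[i] == weights[j] && i < j);
    }

    // edges[i] = {source, target}, weights[i] = weight of edge i.
    // Returns the total weight of a minimum spanning forest.
    public static double mstWeight(int numVertices, int[][] edges, double[] weights) {
        int[] parent = new int[numVertices];
        for (int v = 0; v < numVertices; v++) {
            parent[v] = v;
        }
        double total = 0.0;
        boolean merged = true;
        while (merged) {                      // one Boruvka round per pass
            merged = false;
            int[] cheapest = new int[numVertices];
            Arrays.fill(cheapest, -1);
            // Step 1: each component finds its cheapest outgoing edge.
            for (int i = 0; i < edges.length; i++) {
                int a = find(parent, edges[i][0]);
                int b = find(parent, edges[i][1]);
                if (a == b) {
                    continue;                 // edge lies inside one component
                }
                if (cheapest[a] == -1 || less(weights, i, cheapest[a])) {
                    cheapest[a] = i;
                }
                if (cheapest[b] == -1 || less(weights, i, cheapest[b])) {
                    cheapest[b] = i;
                }
            }
            // Step 2: merge components along the selected edges.
            for (int c = 0; c < numVertices; c++) {
                int i = cheapest[c];
                if (i == -1) {
                    continue;
                }
                int a = find(parent, edges[i][0]);
                int b = find(parent, edges[i][1]);
                if (a == b) {
                    continue;                 // already merged in this round
                }
                parent[a] = b;
                total += weights[i];
                merged = true;
            }
        }
        return total;
    }
}
```

Here the merge step is a cheap in-memory update. In a dataflow setting,
propagating the new component IDs presumably needs an iteration of its own
inside each round, which is where the nesting comes from.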