[jira] [Commented] (FLINK-1520) Read edges and vertices from CSV files

2015-09-24 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14906113#comment-14906113
 ] 

ASF GitHub Bot commented on FLINK-1520:
---

Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/1149


> Read edges and vertices from CSV files
> --
>
> Key: FLINK-1520
> URL: https://issues.apache.org/jira/browse/FLINK-1520
> Project: Flink
>  Issue Type: New Feature
>  Components: Gelly
>Reporter: Vasia Kalavri
>Assignee: Vasia Kalavri
>Priority: Minor
>  Labels: easyfix, newbie
>
> Add methods to create Vertex and Edge Datasets directly from CSV file inputs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-1520) Read edges and vertices from CSV files

2015-09-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14900354#comment-14900354
 ] 

ASF GitHub Bot commented on FLINK-1520:
---

Github user andralungu commented on a diff in the pull request:

https://github.com/apache/flink/pull/1149#discussion_r39947900
  
--- Diff: 
flink-staging/flink-gelly/src/main/java/org/apache/flink/graph/GraphCsvReader.java
 ---
@@ -0,0 +1,486 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.graph;
+import com.google.common.base.Preconditions;
+
+import org.apache.flink.api.common.functions.MapFunction;
+import org.apache.flink.api.java.DataSet;
+import org.apache.flink.api.java.io.CsvReader;
+import org.apache.flink.api.java.tuple.Tuple2;
+import org.apache.flink.api.java.tuple.Tuple3;
+import org.apache.flink.core.fs.Path;
+import org.apache.flink.types.NullValue;
+import org.apache.flink.api.java.ExecutionEnvironment;
+
+/**
+ * A class to build a Graph using path(s) provided to CSV file(s) with 
optional vertex and edge data.
+ * The class also configures the CSV readers used to read edge and vertex 
data such as the field types,
+ * the delimiters (row and field), the fields that should be included or 
skipped, and other flags,
+ * such as whether to skip the initial line as the header.
+ * The configuration is done using the functions provided in the {@link 
org.apache.flink.api.java.io.CsvReader} class.
+ */
+
+public class GraphCsvReader {
+
+   @SuppressWarnings("unused")
+   private final Path vertexPath, edgePath;
+   private final ExecutionEnvironment executionContext;
+   protected CsvReader edgeReader;
+   protected CsvReader vertexReader;
+   protected MapFunction mapper;
+   protected Class vertexKey;
+   protected Class vertexValue;
+   protected Class edgeValue;
+

+//
+   public GraphCsvReader(Path vertexPath, Path edgePath, 
ExecutionEnvironment context) {
+   this.vertexPath = vertexPath;
+   this.edgePath = edgePath;
+   this.vertexReader = new CsvReader(vertexPath, context);
+   this.edgeReader = new CsvReader(edgePath, context);
+   this.mapper = null;
+   this.executionContext = context;
+   }
+
+   public GraphCsvReader(Path edgePath, ExecutionEnvironment context) {
+   this.vertexPath = null;
+   this.edgePath = edgePath;
+   this.edgeReader = new CsvReader(edgePath, context);
+   this.vertexReader = null;
+   this.mapper = null;
+   this.executionContext = context;
+   }
+
+   public  GraphCsvReader(Path edgePath, final MapFunction 
mapper, ExecutionEnvironment context) {
+   this.vertexPath = null;
+   this.edgePath = edgePath;
+   this.edgeReader = new CsvReader(edgePath, context);
+   this.vertexReader = null;
+   this.mapper = mapper;
+   this.executionContext = context;
+   }
+
+   public GraphCsvReader (String edgePath, ExecutionEnvironment context) {
+   this(new Path(Preconditions.checkNotNull(edgePath, "The file 
path may not be null.")), context);
+
+   }
+
+   public GraphCsvReader(String vertexPath, String edgePath, 
ExecutionEnvironment context) {
+   this(new Path(Preconditions.checkNotNull(vertexPath, "The file 
path may not be null.")),
+   new Path(Preconditions.checkNotNull(edgePath, 
"The file path may not be null.")), context);
+   }
+
+
+   public  GraphCsvReader(String edgePath, final MapFunction 
mapper, ExecutionEnvironment context) {
+   this(new Path(Preconditions.checkNotNull(edgePath, "The 
file path may not be 

[jira] [Commented] (FLINK-1520) Read edges and vertices from CSV files

2015-09-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14900361#comment-14900361
 ] 

ASF GitHub Bot commented on FLINK-1520:
---

Github user andralungu commented on a diff in the pull request:

https://github.com/apache/flink/pull/1149#discussion_r39948297
  
--- Diff: 
flink-staging/flink-gelly/src/main/java/org/apache/flink/graph/example/IncrementalSSSP.java
 ---
@@ -110,24 +105,20 @@ public static void main(String [] args) throws 
Exception {
// Emit results
if(fileOutput) {
resultedVertices.writeAsCsv(outputPath, "\n", 
",");
-
-   // since file sinks are lazy, we trigger the 
execution explicitly
-   env.execute("Incremental SSSP Example");
} else {
resultedVertices.print();
}
 
+   env.execute("Incremental SSSP Example");
--- End diff --

I'm not sure whether I'm missing something... Why do you add 
`env.execute()` after `print()`? 
It's no longer needed. Have a look here:

https://github.com/apache/flink/blob/master/flink-examples/flink-java-examples/src/main/java/org/apache/flink/examples/java/graph/PageRankBasic.java
 




[jira] [Commented] (FLINK-1520) Read edges and vertices from CSV files

2015-09-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14900370#comment-14900370
 ] 

ASF GitHub Bot commented on FLINK-1520:
---

Github user vasia commented on a diff in the pull request:

https://github.com/apache/flink/pull/1149#discussion_r39948730
  
--- Diff: 
flink-staging/flink-gelly/src/main/java/org/apache/flink/graph/example/IncrementalSSSP.java
 ---
@@ -110,24 +105,20 @@ public static void main(String [] args) throws 
Exception {
// Emit results
if(fileOutput) {
resultedVertices.writeAsCsv(outputPath, "\n", 
",");
-
-   // since file sinks are lazy, we trigger the 
execution explicitly
-   env.execute("Incremental SSSP Example");
} else {
resultedVertices.print();
}
 
+   env.execute("Incremental SSSP Example");
--- End diff --

That's the result of an auto-merge, I guess. Thanks for spotting it!




[jira] [Commented] (FLINK-1520) Read edges and vertices from CSV files

2015-09-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14900366#comment-14900366
 ] 

ASF GitHub Bot commented on FLINK-1520:
---

Github user andralungu commented on the pull request:

https://github.com/apache/flink/pull/1149#issuecomment-141905104
  
Hi @vasia,

As you said, I already reviewed this :P. I left a couple of comments 
inline. Please re-verify the forwarded-fields annotations. If you put them there 
for one mapper, add them for the others too. 

Apart from that, it's good to merge.  




[jira] [Commented] (FLINK-1520) Read edges and vertices from CSV files

2015-09-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14900373#comment-14900373
 ] 

ASF GitHub Bot commented on FLINK-1520:
---

Github user vasia commented on the pull request:

https://github.com/apache/flink/pull/1149#issuecomment-141905875
  
Thanks @andralungu! I'll address your comments and merge later.




[jira] [Commented] (FLINK-1520) Read edges and vertices from CSV files

2015-09-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14900368#comment-14900368
 ] 

ASF GitHub Bot commented on FLINK-1520:
---

Github user vasia commented on a diff in the pull request:

https://github.com/apache/flink/pull/1149#discussion_r39948692
  
--- Diff: 
flink-staging/flink-gelly/src/main/java/org/apache/flink/graph/GraphCsvReader.java
 ---
@@ -0,0 +1,486 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.graph;
+import com.google.common.base.Preconditions;
+
+import org.apache.flink.api.common.functions.MapFunction;
+import org.apache.flink.api.java.DataSet;
+import org.apache.flink.api.java.io.CsvReader;
+import org.apache.flink.api.java.tuple.Tuple2;
+import org.apache.flink.api.java.tuple.Tuple3;
+import org.apache.flink.core.fs.Path;
+import org.apache.flink.types.NullValue;
+import org.apache.flink.api.java.ExecutionEnvironment;
+
+/**
+ * A class to build a Graph using path(s) provided to CSV file(s) with 
optional vertex and edge data.
+ * The class also configures the CSV readers used to read edge and vertex 
data such as the field types,
+ * the delimiters (row and field), the fields that should be included or 
skipped, and other flags,
+ * such as whether to skip the initial line as the header.
+ * The configuration is done using the functions provided in the {@link 
org.apache.flink.api.java.io.CsvReader} class.
+ */
+
+public class GraphCsvReader {
+
+   @SuppressWarnings("unused")
+   private final Path vertexPath, edgePath;
+   private final ExecutionEnvironment executionContext;
+   protected CsvReader edgeReader;
+   protected CsvReader vertexReader;
+   protected MapFunction mapper;
+   protected Class vertexKey;
+   protected Class vertexValue;
+   protected Class edgeValue;
+

+//
+   public GraphCsvReader(Path vertexPath, Path edgePath, 
ExecutionEnvironment context) {
+   this.vertexPath = vertexPath;
+   this.edgePath = edgePath;
+   this.vertexReader = new CsvReader(vertexPath, context);
+   this.edgeReader = new CsvReader(edgePath, context);
+   this.mapper = null;
+   this.executionContext = context;
+   }
+
+   public GraphCsvReader(Path edgePath, ExecutionEnvironment context) {
+   this.vertexPath = null;
+   this.edgePath = edgePath;
+   this.edgeReader = new CsvReader(edgePath, context);
+   this.vertexReader = null;
+   this.mapper = null;
+   this.executionContext = context;
+   }
+
+   public  GraphCsvReader(Path edgePath, final MapFunction 
mapper, ExecutionEnvironment context) {
+   this.vertexPath = null;
+   this.edgePath = edgePath;
+   this.edgeReader = new CsvReader(edgePath, context);
+   this.vertexReader = null;
+   this.mapper = mapper;
+   this.executionContext = context;
+   }
+
+   public GraphCsvReader (String edgePath, ExecutionEnvironment context) {
+   this(new Path(Preconditions.checkNotNull(edgePath, "The file 
path may not be null.")), context);
+
+   }
+
+   public GraphCsvReader(String vertexPath, String edgePath, 
ExecutionEnvironment context) {
+   this(new Path(Preconditions.checkNotNull(vertexPath, "The file 
path may not be null.")),
+   new Path(Preconditions.checkNotNull(edgePath, 
"The file path may not be null.")), context);
+   }
+
+
+   public  GraphCsvReader(String edgePath, final MapFunction 
mapper, ExecutionEnvironment context) {
+   this(new Path(Preconditions.checkNotNull(edgePath, "The 
file path may not be null.")), 

[jira] [Commented] (FLINK-1520) Read edges and vertices from CSV files

2015-09-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14877297#comment-14877297
 ] 

ASF GitHub Bot commented on FLINK-1520:
---

GitHub user vasia opened a pull request:

https://github.com/apache/flink/pull/1149

[FLINK-1520] [gelly] Create a Graph from CSV files

This builds on @shghatge's work in #847.
I addressed the remaining issues, rebased, and edited the docs.
@andralungu, you've already reviewed this, but if you could give it one 
more look, that'd be great :)
Thanks!

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/vasia/flink csvInput

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/1149.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1149


commit 46f52ae64664be39f73af2505e5ded5e9736a867
Author: Shivani 
Date:   2015-06-17T13:37:36Z

[FLINK-1520] [gelly] Read edges and vertices from CSV files

commit ab114f39e9f1f21802ca63c8bb186f1015b8f460
Author: Shivani 
Date:   2015-07-06T13:41:59Z

[FLINK-1520][gelly]Changed the methods for specifying types. Created a new 
file for tests. Made appropriate changes in gelly_guide.md

commit 8a0b66489407de9aec84c3b715aded7225772ee4
Author: vasia 
Date:   2015-07-14T18:46:33Z

[FLINK-1520] [gelly] types and formatting changes to the graph csv reader

commit 8007acbf06649694429be189bab70aa451cee679
Author: vasia 
Date:   2015-07-27T13:43:59Z

[FLINK-1520] [gelly] added named types methods for reading a Graph from CSV 
input, with and without vertex/edge values. Changes the examples and the tests 
accordingly.

commit 9d02c2baba817948ff8710d2a2ae2dda752bff48
Author: vasia 
Date:   2015-09-19T19:18:53Z

[FLINK-1520] [gelly] corrections in Javadocs; updated documentation






[jira] [Commented] (FLINK-1520) Read edges and vertices from CSV files

2015-09-01 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14725140#comment-14725140
 ] 

ASF GitHub Bot commented on FLINK-1520:
---

Github user shghatge closed the pull request at:

https://github.com/apache/flink/pull/847


> Read edges and vertices from CSV files
> --
>
> Key: FLINK-1520
> URL: https://issues.apache.org/jira/browse/FLINK-1520
> Project: Flink
>  Issue Type: New Feature
>  Components: Gelly
>Reporter: Vasia Kalavri
>Assignee: Shivani Ghatge
>Priority: Minor
>  Labels: easyfix, newbie
>
> Add methods to create Vertex and Edge Datasets directly from CSV file inputs.





[jira] [Commented] (FLINK-1520) Read edges and vertices from CSV files

2015-09-01 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14725142#comment-14725142
 ] 

ASF GitHub Bot commented on FLINK-1520:
---

Github user shghatge commented on the pull request:

https://github.com/apache/flink/pull/847#issuecomment-13651
  
@vasia  It is fine with me.




[jira] [Commented] (FLINK-1520) Read edges and vertices from CSV files

2015-08-22 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14708092#comment-14708092
 ] 

ASF GitHub Bot commented on FLINK-1520:
---

Github user vasia commented on the pull request:

https://github.com/apache/flink/pull/847#issuecomment-133727559
  
Thanks for the comments @andralungu!
@shghatge, can you please close this PR? I will make the docs update and 
open a new one, which will include your work and my changes if that's OK with 
you. Thank you!




[jira] [Commented] (FLINK-1520) Read edges and vertices from CSV files

2015-08-11 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14682279#comment-14682279
 ] 

ASF GitHub Bot commented on FLINK-1520:
---

Github user andralungu commented on the pull request:

https://github.com/apache/flink/pull/847#issuecomment-130013958
  
Hi @vasia, 

Not sure whether this comment was meant for me... Nevertheless, I left some 
suggestions inline. All in all, it covers the problems discussed in the 73! 
comments here. You forgot to properly document the `edgeTypes(K, EV)` etc. 
methods. 




[jira] [Commented] (FLINK-1520) Read edges and vertices from CSV files

2015-08-11 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14682282#comment-14682282
 ] 

ASF GitHub Bot commented on FLINK-1520:
---

Github user andralungu commented on the pull request:

https://github.com/apache/flink/pull/847#issuecomment-130014300
  
I just saw the "I will also update the documentation afterwards" note... Sorry!




[jira] [Commented] (FLINK-1520) Read edges and vertices from CSV files

2015-07-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14635396#comment-14635396
 ] 

ASF GitHub Bot commented on FLINK-1520:
---

Github user vasia commented on the pull request:

https://github.com/apache/flink/pull/847#issuecomment-123402206
  
I see your point @shghatge. 
However, I think naming just one method differently will be confusing.
If we're going to have custom method names, let's go with @andralungu's 
suggestion above and make sure we document these properly.
I would prefer slightly shorter method names, though.
How about:
1). `keyType(K)`
2). `vertexTypes(K, VV)`
3). `edgeTypes(K, EV)`
4). `types(K, VV, EV)`
?
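
A minimal sketch of how the four proposed names could line up, in plain Java without any Flink dependency. Everything here is an assumption made for illustration: `GraphTypesSketch` is a hypothetical class, `NoValue` is a stand-in for Flink's `NullValue`, and each method merely returns a description of the graph type it would resolve to rather than building a real `Graph`.

```java
class GraphTypesSketch {
    /** Hypothetical stand-in for Flink's NullValue ("no value present"). */
    static final class NoValue {}

    // 1) keyType(K): neither vertex values nor edge values are read.
    static String keyType(Class<?> key) {
        return describe(key, NoValue.class, NoValue.class);
    }

    // 2) vertexTypes(K, VV): vertex values only.
    static String vertexTypes(Class<?> key, Class<?> vertexValue) {
        return describe(key, vertexValue, NoValue.class);
    }

    // 3) edgeTypes(K, EV): edge values only.
    static String edgeTypes(Class<?> key, Class<?> edgeValue) {
        return describe(key, NoValue.class, edgeValue);
    }

    // 4) types(K, VV, EV): both vertex and edge values.
    static String types(Class<?> key, Class<?> vertexValue, Class<?> edgeValue) {
        return describe(key, vertexValue, edgeValue);
    }

    // Renders the graph type that a call would resolve to.
    private static String describe(Class<?> k, Class<?> vv, Class<?> ev) {
        return "Graph<" + k.getSimpleName() + ", "
                + vv.getSimpleName() + ", " + ev.getSimpleName() + ">";
    }

    public static void main(String[] args) {
        System.out.println(keyType(Long.class));
        System.out.println(vertexTypes(Long.class, String.class));
        System.out.println(edgeTypes(Long.class, Double.class));
        System.out.println(types(Long.class, String.class, Double.class));
    }
}
```

Because 2) and 3) carry different names, both can take two `Class` arguments without colliding, and the caller never has to invoke a method for a value that is absent.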




[jira] [Commented] (FLINK-1520) Read edges and vertices from CSV files

2015-07-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14634142#comment-14634142
 ] 

ASF GitHub Bot commented on FLINK-1520:
---

Github user shghatge commented on the pull request:

https://github.com/apache/flink/pull/847#issuecomment-123057981
  
The only problem with assuming NullValue when a value is missing is that we 
can't return NullValue in place of VV.
That is, in `Graph<K, VV, EV>`, VV or EV can't be NullValue; 
otherwise that is what I was originally going for. 
Since none of the other methods that create a DataSet/Graph offer a way to 
give EdgeValue as NullValue and just expect the user to map it (at 
least that is what I saw), maybe we could just remove the functionality. I had 
only added it because many examples seemed to use it, so I thought it would be 
nice to have. 
In any case, we could keep just one typesNullEdge method: users who 
don't want it can use the normal overloaded types, with 3 arguments for no 
NullValue, 2 arguments for a null vertex value, and 1 argument for null vertex 
and edge values, plus a single method named typesNullEdge to say that only 
edges have NullValue.




[jira] [Commented] (FLINK-1520) Read edges and vertices from CSV files

2015-07-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14626259#comment-14626259
 ] 

ASF GitHub Bot commented on FLINK-1520:
---

Github user vasia commented on the pull request:

https://github.com/apache/flink/pull/847#issuecomment-121215586
  
yes, I mean `NullValue.class` :)
I'd like to know @shghatge's opinion, too!




[jira] [Commented] (FLINK-1520) Read edges and vertices from CSV files

2015-07-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14626254#comment-14626254
 ] 

ASF GitHub Bot commented on FLINK-1520:
---

Github user andralungu commented on the pull request:

https://github.com/apache/flink/pull/847#issuecomment-121215203
  
Hmmm :-? But can you pass NullValue to `types`? It expects Something.class. 
Can it be overwritten without type erasure getting in the way? 

Anyway... I will let @shghatge take over from here :) 




[jira] [Commented] (FLINK-1520) Read edges and vertices from CSV files

2015-07-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14624390#comment-14624390
 ] 

ASF GitHub Bot commented on FLINK-1520:
---

Github user andralungu commented on the pull request:

https://github.com/apache/flink/pull/847#issuecomment-120856171
  
Hi,

I just had a closer look at this PR and it made me seriously question the 
utility of a `Graph.fromCSV` method. Why? First of all, because it's more 
limited than the regular `env.fromCsv()`, in the sense that it does not allow 
POJOs, and it would be a bit tedious to support that. There would be a need for 
methods with 2 to n fields, according to the number of attributes present in 
the POJO. 

Second, because, and I am speaking strictly as a user here, I would rather 
write:
    private static DataSet<Edge<Long, Double>> getEdgesDataSet(
            ExecutionEnvironment env) {

        if(fileOutput) {
            return env.readCsvFile(edgeInputPath)
                    .ignoreComments("#")
                    .fieldDelimiter("\t")
                    .lineDelimiter("\n")
                    .types(Long.class, Long.class, Double.class)
                    .map(new Tuple3ToEdgeMap<Long, Double>());
        } else {
            return CommunityDetectionData.getDefaultEdgeDataSet(env);
        }
    }

than...

    private static Graph<Long, Long, Double> getGraph(ExecutionEnvironment env) {
        Graph<Long, Long, Double> graph;
        if(!fileOutput) {
            DataSet<Edge<Long, Double>> edges =
                    CommunityDetectionData.getDefaultEdgeDataSet(env);
            graph = Graph.fromDataSet(edges,
                    new MapFunction<Long, Long>() {

                        public Long map(Long label) {
                            return label;
                        }
                    }, env);
        } else {
            graph = Graph.fromCsvReader(edgeInputPath,
                    new MapFunction<Long, Long>() {
                        public Long map(Long label) {
                            return label;
                        }
                    }, env).ignoreCommentsEdges("#")
                    .fieldDelimiterEdges("\t")
                    .lineDelimiterEdges("\n")
                    .typesEdges(Long.class, Double.class)
                    .typesVertices(Long.class, Long.class);
        }
        return graph;
    }

Maybe it's just a preference thing... but I believe it's at least worth a 
discussion. On the other hand, the utility of such a method should have been 
questioned from its early Jira days, so I guess that's my mistake.

I would like to hear your thoughts on this. 
Thanks!




[jira] [Commented] (FLINK-1520) Read edges and vertices from CSV files

2015-07-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14625209#comment-14625209
 ] 

ASF GitHub Bot commented on FLINK-1520:
---

Github user vasia commented on the pull request:

https://github.com/apache/flink/pull/847#issuecomment-121033796
  
In my previous comment I hadn't realized that both of them would need to be 
called, my bad.
Any ideas for decent method names? `typesNoEdgeValue` and 
`typesNoVertexValue` seem really ugly to me :S




[jira] [Commented] (FLINK-1520) Read edges and vertices from CSV files

2015-07-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14625201#comment-14625201
 ] 

ASF GitHub Bot commented on FLINK-1520:
---

Github user andralungu commented on the pull request:

https://github.com/apache/flink/pull/847#issuecomment-121032514
  
Hi @vasia, 

I also saw the types issue, but I had a feeling that this is the way it was 
decided in the previous comment. I would rather have different names for 2 and 
3 than to force a call to `typeVertices` if it's not needed.  




[jira] [Commented] (FLINK-1520) Read edges and vertices from CSV files

2015-07-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14625177#comment-14625177
 ] 

ASF GitHub Bot commented on FLINK-1520:
---

Github user vasia commented on the pull request:

https://github.com/apache/flink/pull/847#issuecomment-121025189
  
Hi @andralungu,

do you mean support for POJOs as vertex / edge values?
I guess that's a limitation we can't easily overcome, I agree.
Still though, a nicely designed `fromCsv()` method would simplify the 
common case.

As for the examples, I don't like what they currently look like in this PR 
either. However, that's not a problem of `fromCsv()`. The if-block can be 
easily simplified by changing `getDefaultEdgeDataSet` to `getDefaultGraph`. The 
else-block looks longer because of the mapper, which, in the current examples 
is in the main method.

What I find quite problematic is the `types()` methods. Ideally, we 
would have the following:
1. `types(K)`: no vertex value, no edge value
2. `types(K, VV)`: no edge value
3. `types(K, EV)`: no vertex value
4. `types(K, VV, EV)`: both vertex and edge values are present
However, because of type erasure, 2 and 3 would end up with the same 
signature, so we can't have both. The current implementation (having separate 
`typesEdges` and `typesVertices`) means that both should always be called, even 
if not necessary. Another way would be to give 2 and 3 different names... So 
far I haven't been able to come up with a nice solution. Ideas?
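
To make the erasure problem concrete, here is a minimal Java sketch. `ReaderTypesSketch` and its fields are hypothetical stand-ins, not the PR's `GraphCsvReader`; the two method names are the ones floated in this thread. It only illustrates why options 2 and 3 cannot share the name `types`, and how separate names sidestep the clash.

```java
/** Hypothetical stand-in for a reader's type configuration; not the PR's GraphCsvReader. */
final class ReaderTypesSketch<K, VV, EV> {

    private Class<K> keyType;
    private Class<VV> vertexValueType; // stays null when vertices carry no value
    private Class<EV> edgeValueType;   // stays null when edges carry no value

    // Options 2 and 3 cannot both be called types(...). Declaring
    //     types(Class<K> key, Class<VV> vertexValue)
    //     types(Class<K> key, Class<EV> edgeValue)
    // side by side is rejected by javac as a name clash, because both
    // signatures erase to types(Class, Class).

    /** Option 2 under its own name: key and vertex value types, no edge value. */
    ReaderTypesSketch<K, VV, EV> typesNoEdgeValue(Class<K> key, Class<VV> vertexValue) {
        this.keyType = key;
        this.vertexValueType = vertexValue;
        this.edgeValueType = null;
        return this;
    }

    /** Option 3 under its own name: key and edge value types, no vertex value. */
    ReaderTypesSketch<K, VV, EV> typesNoVertexValue(Class<K> key, Class<EV> edgeValue) {
        this.keyType = key;
        this.vertexValueType = null;
        this.edgeValueType = edgeValue;
        return this;
    }

    String describe() {
        return "key=" + name(keyType)
                + ", vertexValue=" + name(vertexValueType)
                + ", edgeValue=" + name(edgeValueType);
    }

    private static String name(Class<?> type) {
        return type == null ? "none" : type.getSimpleName();
    }

    public static void main(String[] args) {
        ReaderTypesSketch<Long, Double, String> reader =
                new ReaderTypesSketch<Long, Double, String>().typesNoEdgeValue(Long.class, Double.class);
        System.out.println(reader.describe());
    }
}
```

The cost of this route is the one raised in the thread: two extra method names that exist only to work around the compiler.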




[jira] [Commented] (FLINK-1520) Read edges and vertices from CSV files

2015-07-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14625250#comment-14625250
 ] 

ASF GitHub Bot commented on FLINK-1520:
---

Github user andralungu commented on the pull request:

https://github.com/apache/flink/pull/847#issuecomment-121039562
  
Yes, but then you would have the following methods: `types`, 
`typesNoEdgeValue`, `typesNoVertexValue` and again `types`. So, even if it's 
not 100% needed, I'd try to keep it consistent. We could also make it more 
graph-oriented (the name `types` is generic). The following is just an example:
1. `keyType(K)`
2. `keyAndVertexTypes(K, VV)`
3. `keyAndEdgeTypes(K, EV)`
4. `keyVertexAndEdgeTypes(K, VV, EV)`

With good documentation, I think I'd understand what these are for :)
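
A rough sketch of how the four graph-oriented names could look as generic factory methods. Everything here is an assumption made for illustration: `GraphTypesSketch`, the `GraphTypes` holder, and the local `NullValue` class (a stand-in for `org.apache.flink.types.NullValue`, so the snippet compiles without Flink) are not part of the PR.

```java
/** Illustrative only: the four method names from the list above as static factories. */
final class GraphTypesSketch {

    /** Stand-in for org.apache.flink.types.NullValue so the sketch compiles without Flink. */
    static final class NullValue {
        private NullValue() {}
    }

    /** Immutable holder for the key, vertex value and edge value types. */
    static final class GraphTypes<K, VV, EV> {
        final Class<K> key;
        final Class<VV> vertexValue;
        final Class<EV> edgeValue;

        GraphTypes(Class<K> key, Class<VV> vertexValue, Class<EV> edgeValue) {
            this.key = key;
            this.vertexValue = vertexValue;
            this.edgeValue = edgeValue;
        }

        boolean hasVertexValue() {
            return !NullValue.class.equals(vertexValue);
        }

        boolean hasEdgeValue() {
            return !NullValue.class.equals(edgeValue);
        }
    }

    // Each name fixes the missing type parameters to NullValue, so the four
    // variants never compete for the same erased signature.
    static <K> GraphTypes<K, NullValue, NullValue> keyType(Class<K> key) {
        return new GraphTypes<>(key, NullValue.class, NullValue.class);
    }

    static <K, VV> GraphTypes<K, VV, NullValue> keyAndVertexTypes(Class<K> key, Class<VV> vertexValue) {
        return new GraphTypes<>(key, vertexValue, NullValue.class);
    }

    static <K, EV> GraphTypes<K, NullValue, EV> keyAndEdgeTypes(Class<K> key, Class<EV> edgeValue) {
        return new GraphTypes<>(key, NullValue.class, edgeValue);
    }

    static <K, VV, EV> GraphTypes<K, VV, EV> keyVertexAndEdgeTypes(
            Class<K> key, Class<VV> vertexValue, Class<EV> edgeValue) {
        return new GraphTypes<>(key, vertexValue, edgeValue);
    }

    public static void main(String[] args) {
        GraphTypes<Long, NullValue, Double> types = keyAndEdgeTypes(Long.class, Double.class);
        System.out.println("vertex value: " + types.hasVertexValue() + ", edge value: " + types.hasEdgeValue());
    }
}
```

Because each factory pins the absent value types to `NullValue`, the result type already tells the caller which values the graph will carry.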




[jira] [Commented] (FLINK-1520) Read edges and vertices from CSV files

2015-07-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14625246#comment-14625246
 ] 

ASF GitHub Bot commented on FLINK-1520:
---

Github user andralungu commented on a diff in the pull request:

https://github.com/apache/flink/pull/847#discussion_r34504919
  
--- Diff: 
flink-staging/flink-gelly/src/main/java/org/apache/flink/graph/GraphCsvReader.java
 ---
@@ -0,0 +1,462 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.graph;
+import com.google.common.base.Preconditions;
+import org.apache.flink.api.common.functions.MapFunction;
+import org.apache.flink.api.java.DataSet;
+import org.apache.flink.api.java.io.CsvReader;
+import org.apache.flink.api.java.tuple.Tuple2;
+import org.apache.flink.api.java.tuple.Tuple3;
+import org.apache.flink.core.fs.Path;
+import org.apache.flink.api.java.ExecutionEnvironment;
+import org.apache.flink.types.NullValue;
+/**
+ * A class to build a Graph using path(s) provided to CSV file(s) with edge (vertices) data
+ * The class also configures the CSV readers used to read edges(vertices) data such as the field types,
+ * the delimiters (row and field),  the fields that should be included or skipped, and other flags
+ * such as whether to skip the initial line as the header.
+ * The configuration is done using the functions provided in The {@link org.apache.flink.api.java.io.CsvReader} class.
+ */
+@SuppressWarnings({"unused" , "unchecked"})
+public class GraphCsvReader<K,VV,EV> {
+
+   private final Path vertexPath,edgePath;
+   private final ExecutionEnvironment executionContext;
+   protected CsvReader EdgeReader;
+   protected CsvReader VertexReader;
+   protected MapFunction<K, VV> mapper;
+   protected Class<K> vertexKey;
+   protected Class<VV> vertexValue;
+   protected Class<EV> edgeValue;
+

+//
+   public GraphCsvReader(Path vertexPath,Path edgePath, ExecutionEnvironment context) {
+   this.vertexPath = vertexPath;
+   this.edgePath = edgePath;
+   this.VertexReader = new CsvReader(vertexPath,context);
+   this.EdgeReader = new CsvReader(edgePath,context);
+   this.mapper=null;
+   this.executionContext=context;
+   }
+
+   public GraphCsvReader(Path edgePath, ExecutionEnvironment context) {
+   this.vertexPath = null;
+   this.edgePath = edgePath;
+   this.EdgeReader = new CsvReader(edgePath,context);
+   this.VertexReader = null;
+   this.mapper = null;
+   this.executionContext=context;
+   }
+
+   public GraphCsvReader(Path edgePath,final MapFunction<K, VV> mapper, ExecutionEnvironment context) {
+   this.vertexPath = null;
+   this.edgePath = edgePath;
+   this.EdgeReader = new CsvReader(edgePath,context);
+   this.VertexReader = null;
+   this.mapper = mapper;
+   this.executionContext=context;
+   }
+
+   public GraphCsvReader (String edgePath,ExecutionEnvironment context) {
+   this(new Path(Preconditions.checkNotNull(edgePath, "The file path may not be null.")), context);
+
+   }
+
+   public GraphCsvReader(String vertexPath, String edgePath, ExecutionEnvironment context) {
+   this(new Path(Preconditions.checkNotNull(vertexPath, "The file path may not be null.")),
+   new Path(Preconditions.checkNotNull(edgePath, "The file path may not be null.")), context);
+   }
+
+
+   public GraphCsvReader (String edgePath, final MapFunction<K, VV> mapper, ExecutionEnvironment context) {
+   this(new Path(Preconditions.checkNotNull(edgePath, "The file path may not be null.")),mapper, context);
+   }
+

[jira] [Commented] (FLINK-1520) Read edges and vertices from CSV files

2015-07-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14625244#comment-14625244
 ] 

ASF GitHub Bot commented on FLINK-1520:
---

Github user andralungu commented on a diff in the pull request:

https://github.com/apache/flink/pull/847#discussion_r34504887
  

[jira] [Commented] (FLINK-1520) Read edges and vertices from CSV files

2015-07-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14625425#comment-14625425
 ] 

ASF GitHub Bot commented on FLINK-1520:
---

Github user vasia commented on the pull request:

https://github.com/apache/flink/pull/847#issuecomment-121072347
  
Still overkill, I think... Could another way be to have only `types(K, 
VV, EV)` with all 3 arguments and expect `NullValue` if a value is missing?
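
A minimal sketch of that single-method alternative, under stated assumptions: `SingleTypesSketch` is hypothetical, the nested `NullValue` is a stand-in for `org.apache.flink.types.NullValue`, and the column counts assume a vertex line is `id[,value]` and an edge line is `source,target[,value]`.

```java
/** Illustrative only: one types(K, VV, EV) method, with NullValue marking a missing value. */
final class SingleTypesSketch<K, VV, EV> {

    /** Stand-in for org.apache.flink.types.NullValue so the sketch compiles without Flink. */
    static final class NullValue {
        private NullValue() {}
    }

    private Class<K> keyType;
    private Class<VV> vertexValueType;
    private Class<EV> edgeValueType;

    /** The only type-configuration method: callers pass NullValue.class for an absent value. */
    SingleTypesSketch<K, VV, EV> types(Class<K> key, Class<VV> vertexValue, Class<EV> edgeValue) {
        this.keyType = key;
        this.vertexValueType = vertexValue;
        this.edgeValueType = edgeValue;
        return this;
    }

    /** Assumed layout: "id" or "id,value". */
    int expectedVertexColumns() {
        return NullValue.class.equals(vertexValueType) ? 1 : 2;
    }

    /** Assumed layout: "source,target" or "source,target,value". */
    int expectedEdgeColumns() {
        return NullValue.class.equals(edgeValueType) ? 2 : 3;
    }

    public static void main(String[] args) {
        SingleTypesSketch<Long, NullValue, Double> reader =
                new SingleTypesSketch<Long, NullValue, Double>()
                        .types(Long.class, NullValue.class, Double.class);
        System.out.println(reader.expectedVertexColumns() + " vertex column(s), "
                + reader.expectedEdgeColumns() + " edge column(s)");
    }
}
```

This keeps one method name and avoids the erasure clash entirely, at the price of making every caller spell out `NullValue.class` for values they do not have.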




[jira] [Commented] (FLINK-1520) Read edges and vertices from CSV files

2015-07-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14619182#comment-14619182
 ] 

ASF GitHub Bot commented on FLINK-1520:
---

Github user shghatge commented on the pull request:

https://github.com/apache/flink/pull/847#issuecomment-119698681
  
Updated PR




[jira] [Commented] (FLINK-1520) Read edges and vertices from CSV files

2015-07-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14612516#comment-14612516
 ] 

ASF GitHub Bot commented on FLINK-1520:
---

Github user vasia commented on a diff in the pull request:

https://github.com/apache/flink/pull/847#discussion_r33822653
  
--- Diff: 
flink-staging/flink-gelly/src/main/java/org/apache/flink/graph/GraphCsvReader.java
 ---
@@ -0,0 +1,471 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.graph;
+import com.google.common.base.Preconditions;
+import org.apache.flink.api.common.functions.MapFunction;
+import org.apache.flink.api.java.DataSet;
+import org.apache.flink.api.java.io.CsvReader;
+import org.apache.flink.api.java.tuple.Tuple2;
+import org.apache.flink.api.java.tuple.Tuple3;
+import org.apache.flink.core.fs.Path;
+import org.apache.flink.api.java.ExecutionEnvironment;
+import org.apache.flink.types.NullValue;
+/**
+ * A class to build a Graph using path(s) provided to CSV file(s) with edge (vertices) data
+ * The class also configures the CSV readers used to read edges(vertices) data such as the field types,
+ * the delimiters (row and field),  the fields that should be included or skipped, and other flags
+ * such as whether to skip the initial line as the header.
+ * The configuration is done using the functions provided in The {@link org.apache.flink.api.java.io.CsvReader} class.
+ */
+public class GraphCsvReader<K,VV,EV> {
+
+   private final Path vertexPath,edgePath;
+   private final ExecutionEnvironment executionContext;
+   protected CsvReader EdgeReader;
+   protected CsvReader VertexReader;
+   protected MapFunction<K, VV> mapper;
+

+//
+
+   public GraphCsvReader(Path vertexPath,Path edgePath, ExecutionEnvironment context) {
+   this.vertexPath = vertexPath;
+   this.edgePath = edgePath;
+   this.VertexReader = new CsvReader(vertexPath,context);
+   this.EdgeReader = new CsvReader(edgePath,context);
+   this.mapper=null;
+   this.executionContext=context;
+   }
+
+   public GraphCsvReader(Path edgePath, ExecutionEnvironment context) {
+   this.vertexPath = null;
+   this.edgePath = edgePath;
+   this.EdgeReader = new CsvReader(edgePath,context);
+   this.VertexReader = null;
+   this.mapper = null;
+   this.executionContext=context;
+   }
+
+   public GraphCsvReader(Path edgePath,final MapFunction<K, VV> mapper, ExecutionEnvironment context) {
+   this.vertexPath = null;
+   this.edgePath = edgePath;
+   this.EdgeReader = new CsvReader(edgePath,context);
+   this.VertexReader = null;
+   this.mapper = mapper;
+   this.executionContext=context;
+   }
+
+   public GraphCsvReader (String edgePath,ExecutionEnvironment context) {
+   this(new Path(Preconditions.checkNotNull(edgePath, "The file path may not be null.")), context);
+
+   }
+
+   public GraphCsvReader(String vertexPath, String edgePath, ExecutionEnvironment context) {
+   this(new Path(Preconditions.checkNotNull(vertexPath, "The file path may not be null.")),new Path(Preconditions.checkNotNull(edgePath, "The file path may not be null.")), context);
+   }
+
+
+   public GraphCsvReader (String edgePath, final MapFunction<K, VV> mapper, ExecutionEnvironment context) {
+   this(new Path(Preconditions.checkNotNull(edgePath, "The file path may not be null.")),mapper, context);
+   }
+
+   public CsvReader getEdgeReader() {
+   return this.EdgeReader;
+   }
+
+   public CsvReader getVertexReader() {
+   return this.VertexReader;
   

[jira] [Commented] (FLINK-1520) Read edges and vertices from CSV files

2015-07-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14612532#comment-14612532
 ] 

ASF GitHub Bot commented on FLINK-1520:
---

Github user vasia commented on a diff in the pull request:

https://github.com/apache/flink/pull/847#discussion_r33823666
  
--- Diff: 
flink-staging/flink-gelly/src/test/java/org/apache/flink/graph/test/operations/GraphCreationITCase.java
 ---
@@ -54,16 +75,13 @@ public void testCreateWithoutVertexValues() throws Exception {
final ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
Graph<Long, NullValue, Long> graph = Graph.fromDataSet(TestGraphUtils.getLongLongEdgeData(env), env);
 
-DataSet<Vertex<Long,NullValue>> data = graph.getVertices();
-List<Vertex<Long,NullValue>> result= data.collect();
-
+   graph.getVertices().writeAsCsv(resultPath);
--- End diff --

hmm it seems you're reverting the changes of #863?




[jira] [Commented] (FLINK-1520) Read edges and vertices from CSV files

2015-07-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14612535#comment-14612535
 ] 

ASF GitHub Bot commented on FLINK-1520:
---

Github user shghatge commented on a diff in the pull request:

https://github.com/apache/flink/pull/847#discussion_r33823720
  

[jira] [Commented] (FLINK-1520) Read edges and vertices from CSV files

2015-07-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14612538#comment-14612538
 ] 

ASF GitHub Bot commented on FLINK-1520:
---

Github user shghatge commented on a diff in the pull request:

https://github.com/apache/flink/pull/847#discussion_r33823858
  

[jira] [Commented] (FLINK-1520) Read edges and vertices from CSV files

2015-07-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14612537#comment-14612537
 ] 

ASF GitHub Bot commented on FLINK-1520:
---

Github user vasia commented on a diff in the pull request:

https://github.com/apache/flink/pull/847#discussion_r33823849
  
--- Diff: 
flink-staging/flink-gelly/src/test/java/org/apache/flink/graph/test/operations/GraphCreationWithMapperITCase.java
 ---
@@ -52,16 +72,13 @@ public void testWithDoubleValueMapper() throws Exception {
Graph<Long, Double, Long> graph = Graph.fromDataSet(TestGraphUtils.getLongLongEdgeData(env),
new AssignDoubleValueMapper(), env);
 
-DataSet<Vertex<Long,Double>> data = graph.getVertices();
-List<Vertex<Long,Double>> result= data.collect();
-   
+   graph.getVertices().writeAsCsv(resultPath);
--- End diff --

Same here.. We changed the tests to use `collect()` instead of files in 
#863. Please don't change it back ;)




[jira] [Commented] (FLINK-1520) Read edges and vertices from CSV files

2015-07-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14612559#comment-14612559
 ] 

ASF GitHub Bot commented on FLINK-1520:
---

Github user vasia commented on the pull request:

https://github.com/apache/flink/pull/847#issuecomment-118173012
  
Hi @shghatge! Thank you for the update :)

I left some comments inline. There are still some formatting issues in the 
code. Please go through your changes carefully and try to be consistent. Also, 
there are still several warnings regarding types, unused annotations, and unused 
variables. Can you please try to remove them? Your IDE should have a setting 
that gives you the list of warnings.

Regarding the tests, it would be better to create new test files for your 
methods, since you need to test with files and currently other tests use `collect()`.

Finally, I find the `types()` methods a bit confusing. Could we maybe have 
separate types methods for the vertices and edges? e.g. `typesEdges(keyType, 
valueType)`, `typesEdges(keyType)`, `typesVertices(keyType, valueType)` and 
`typesVertices(keyType)`?
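
For illustration, a minimal sketch of the split suggested above, with separate overloads for edges and vertices. `SplitTypesSketch` is a hypothetical class, not code from the PR; only the four method signatures come from the suggestion. The overloads differ in argument count, so there is no erasure clash.

```java
/** Illustrative only: separate type-configuration methods for edges and vertices. */
final class SplitTypesSketch<K, VV, EV> {

    private Class<K> keyType;
    private Class<VV> vertexValueType; // null means "vertices have no value"
    private Class<EV> edgeValueType;   // null means "edges have no value"
    private boolean edgesTyped;
    private boolean verticesTyped;

    SplitTypesSketch<K, VV, EV> typesEdges(Class<K> key) {
        return typesEdges(key, null);
    }

    SplitTypesSketch<K, VV, EV> typesEdges(Class<K> key, Class<EV> value) {
        this.keyType = key;
        this.edgeValueType = value;
        this.edgesTyped = true;
        return this;
    }

    SplitTypesSketch<K, VV, EV> typesVertices(Class<K> key) {
        return typesVertices(key, null);
    }

    SplitTypesSketch<K, VV, EV> typesVertices(Class<K> key, Class<VV> value) {
        this.keyType = key;
        this.vertexValueType = value;
        this.verticesTyped = true;
        return this;
    }

    /** Summarises what has been configured so far. */
    String summary() {
        return "edges" + describe(edgesTyped, edgeValueType)
                + " vertices" + describe(verticesTyped, vertexValueType);
    }

    private String describe(boolean typed, Class<?> value) {
        if (!typed) {
            return "(untyped)";
        }
        String key = keyType.getSimpleName();
        return value == null ? "(" + key + ")" : "(" + key + ", " + value.getSimpleName() + ")";
    }

    public static void main(String[] args) {
        System.out.println(new SplitTypesSketch<Long, String, Double>()
                .typesEdges(Long.class, Double.class)
                .typesVertices(Long.class)
                .summary());
    }
}
```

One consequence of this shape is that the key type is passed twice, and a reader has to decide what an untyped side means.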





[jira] [Commented] (FLINK-1520) Read edges and vertices from CSV files

2015-07-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14612513#comment-14612513
 ] 

ASF GitHub Bot commented on FLINK-1520:
---

Github user vasia commented on a diff in the pull request:

https://github.com/apache/flink/pull/847#discussion_r33822502
  
--- Diff: 
flink-staging/flink-gelly/src/main/java/org/apache/flink/graph/GraphCsvReader.java
 ---
@@ -0,0 +1,471 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * License); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an AS IS BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.graph;
+import com.google.common.base.Preconditions;
+import org.apache.flink.api.common.functions.MapFunction;
+import org.apache.flink.api.java.DataSet;
+import org.apache.flink.api.java.io.CsvReader;
+import org.apache.flink.api.java.tuple.Tuple2;
+import org.apache.flink.api.java.tuple.Tuple3;
+import org.apache.flink.core.fs.Path;
+import org.apache.flink.api.java.ExecutionEnvironment;
+import org.apache.flink.types.NullValue;
+/**
+ * A class to build a Graph using path(s) provided to CSV file(s) with 
edge (vertices) data
+ * The class also configures the CSV readers used to read edges(vertices) 
data such as the field types,
+ * the delimiters (row and field),  the fields that should be included or 
skipped, and other flags
+ * such as whether to skip the initial line as the header.
+ * The configuration is done using the functions provided in The {@link 
org.apache.flink.api.java.io.CsvReader} class.
+ */
+public class GraphCsvReader<K,VV,EV> {
+
+   private final Path vertexPath,edgePath;
+   private final ExecutionEnvironment executionContext;
+   protected CsvReader EdgeReader;
+   protected CsvReader VertexReader;
+   protected MapFunction<K, VV> mapper;
+

+//
+
+   public GraphCsvReader(Path vertexPath,Path edgePath, 
ExecutionEnvironment context) {
+   this.vertexPath = vertexPath;
+   this.edgePath = edgePath;
+   this.VertexReader = new CsvReader(vertexPath,context);
+   this.EdgeReader = new CsvReader(edgePath,context);
+   this.mapper=null;
+   this.executionContext=context;
+   }
+
+   public GraphCsvReader(Path edgePath, ExecutionEnvironment context) {
+   this.vertexPath = null;
+   this.edgePath = edgePath;
+   this.EdgeReader = new CsvReader(edgePath,context);
+   this.VertexReader = null;
+   this.mapper = null;
+   this.executionContext=context;
+   }
+
+   public GraphCsvReader(Path edgePath,final MapFunction<K, VV> mapper, 
ExecutionEnvironment context) {
+   this.vertexPath = null;
+   this.edgePath = edgePath;
+   this.EdgeReader = new CsvReader(edgePath,context);
+   this.VertexReader = null;
+   this.mapper = mapper;
+   this.executionContext=context;
+   }
+
+   public GraphCsvReader (String edgePath,ExecutionEnvironment context) {
+   this(new Path(Preconditions.checkNotNull(edgePath, "The file 
path may not be null.")), context);
+
+   }
+
+   public GraphCsvReader(String vertexPath, String edgePath, 
ExecutionEnvironment context) {
+   this(new Path(Preconditions.checkNotNull(vertexPath, "The file 
path may not be null.")),new Path(Preconditions.checkNotNull(edgePath, "The 
file path may not be null.")), context);
+   }
+
+
+   public GraphCsvReader (String edgePath, final MapFunction<K, VV> 
mapper, ExecutionEnvironment context) {
+   this(new Path(Preconditions.checkNotNull(edgePath, "The 
file path may not be null.")),mapper, context);
+   }
+
+   public CsvReader getEdgeReader() {
+   return this.EdgeReader;
+   }
+
+   public CsvReader getVertexReader() {
+   return this.VertexReader;
   

[jira] [Commented] (FLINK-1520) Read edges and vertices from CSV files

2015-07-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14612545#comment-14612545
 ] 

ASF GitHub Bot commented on FLINK-1520:
---

Github user shghatge commented on a diff in the pull request:

https://github.com/apache/flink/pull/847#discussion_r33824099
  
--- Diff: 
flink-staging/flink-gelly/src/test/java/org/apache/flink/graph/test/operations/GraphCreationITCase.java
 ---
@@ -54,16 +75,13 @@ public void testCreateWithoutVertexValues() throws 
Exception {
final ExecutionEnvironment env = 
ExecutionEnvironment.getExecutionEnvironment();
Graph<Long, NullValue, Long> graph = 
Graph.fromDataSet(TestGraphUtils.getLongLongEdgeData(env), env);
 
-DataSet<Vertex<Long,NullValue>> data = graph.getVertices();
-List<Vertex<Long,NullValue>> result= data.collect();
-
+   graph.getVertices().writeAsCsv(resultPath);
--- End diff --

Oh... I made these changes before that pull request got merged. I'll change it 
now.




[jira] [Commented] (FLINK-1520) Read edges and vertices from CSV files

2015-07-01 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14610786#comment-14610786
 ] 

ASF GitHub Bot commented on FLINK-1520:
---

Github user andralungu commented on the pull request:

https://github.com/apache/flink/pull/847#issuecomment-117787551
  
Nice and rebased. +1




[jira] [Commented] (FLINK-1520) Read edges and vertices from CSV files

2015-07-01 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14610529#comment-14610529
 ] 

ASF GitHub Bot commented on FLINK-1520:
---

Github user shghatge commented on the pull request:

https://github.com/apache/flink/pull/847#issuecomment-117728162
  
Updated PR




[jira] [Commented] (FLINK-1520) Read edges and vertices from CSV files

2015-06-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14602859#comment-14602859
 ] 

ASF GitHub Bot commented on FLINK-1520:
---

Github user andralungu commented on a diff in the pull request:

https://github.com/apache/flink/pull/847#discussion_r33354597
  
--- Diff: 
flink-staging/flink-gelly/src/main/java/org/apache/flink/graph/GraphCsvReader.java
 ---
@@ -0,0 +1,502 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * License); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an AS IS BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.graph;
+import com.google.common.base.Preconditions;
+import org.apache.flink.api.common.functions.MapFunction;
+import org.apache.flink.api.java.DataSet;
+import org.apache.flink.api.java.io.CsvReader;
+import org.apache.flink.api.java.tuple.Tuple2;
+import org.apache.flink.api.java.tuple.Tuple3;
+import org.apache.flink.core.fs.Path;
+import org.apache.flink.api.java.ExecutionEnvironment;
+import org.apache.flink.types.NullValue;
+/**
+ * A class to build a Graph using path(s) provided to CSV file(s) with 
edge (vertices) data
+ * The class also configures the CSV readers used to read edges(vertices) 
data such as the field types,
+ * the delimiters (row and field),  the fields that should be included or 
skipped, and other flags
+ * such as whether to skip the initial line as the header.
+ * The configuration is done using the functions provided in The {@link 
org.apache.flink.api.java.io.CsvReader} class.
+ */
+public class GraphCsvReader<K,VV,EV>{
+
+   private final Path path1,path2;
+   private final ExecutionEnvironment executionContext;
+   protected CsvReader EdgeReader;
+   protected CsvReader VertexReader;
+   protected MapFunction<K, VV> mapper;
+

+//
+
+   public GraphCsvReader(Path path1,Path path2, ExecutionEnvironment 
context)
+   {
+   this.path1 = path1;
+   this.path2 = path2;
+   this.VertexReader = new CsvReader(path1,context);
+   this.EdgeReader = new CsvReader(path2,context);
+   this.mapper=null;
+   this.executionContext=context;
+   }
+
+   public GraphCsvReader(Path path2, ExecutionEnvironment context)
+   {
--- End diff --

again the bracket issue :)




[jira] [Commented] (FLINK-1520) Read edges and vertices from CSV files

2015-06-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14602861#comment-14602861
 ] 

ASF GitHub Bot commented on FLINK-1520:
---

Github user andralungu commented on a diff in the pull request:

https://github.com/apache/flink/pull/847#discussion_r33354654
  
--- Diff: 
flink-staging/flink-gelly/src/main/java/org/apache/flink/graph/GraphCsvReader.java
 ---
@@ -0,0 +1,502 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * License); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an AS IS BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.graph;
+import com.google.common.base.Preconditions;
+import org.apache.flink.api.common.functions.MapFunction;
+import org.apache.flink.api.java.DataSet;
+import org.apache.flink.api.java.io.CsvReader;
+import org.apache.flink.api.java.tuple.Tuple2;
+import org.apache.flink.api.java.tuple.Tuple3;
+import org.apache.flink.core.fs.Path;
+import org.apache.flink.api.java.ExecutionEnvironment;
+import org.apache.flink.types.NullValue;
+/**
+ * A class to build a Graph using path(s) provided to CSV file(s) with 
edge (vertices) data
+ * The class also configures the CSV readers used to read edges(vertices) 
data such as the field types,
+ * the delimiters (row and field),  the fields that should be included or 
skipped, and other flags
+ * such as whether to skip the initial line as the header.
+ * The configuration is done using the functions provided in The {@link 
org.apache.flink.api.java.io.CsvReader} class.
+ */
+public class GraphCsvReader<K,VV,EV>{
+
+   private final Path path1,path2;
+   private final ExecutionEnvironment executionContext;
+   protected CsvReader EdgeReader;
+   protected CsvReader VertexReader;
+   protected MapFunction<K, VV> mapper;
+

+//
+
+   public GraphCsvReader(Path path1,Path path2, ExecutionEnvironment 
context)
+   {
+   this.path1 = path1;
+   this.path2 = path2;
+   this.VertexReader = new CsvReader(path1,context);
+   this.EdgeReader = new CsvReader(path2,context);
+   this.mapper=null;
+   this.executionContext=context;
+   }
+
+   public GraphCsvReader(Path path2, ExecutionEnvironment context)
+   {
+   this.path1=null;
+   this.path2 = path2;
+   this.EdgeReader = new CsvReader(path2,context);
+   this.VertexReader = null;
+   this.mapper = null;
+   this.executionContext=context;
+   }
+
+   public GraphCsvReader(Path path2,final MapFunction<K, VV> mapper, 
ExecutionEnvironment context)
+   {
+   this.path1=null;
--- End diff --

Here it should be `this.path1 = null;` for consistency with the rest.




[jira] [Commented] (FLINK-1520) Read edges and vertices from CSV files

2015-06-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14602863#comment-14602863
 ] 

ASF GitHub Bot commented on FLINK-1520:
---

Github user andralungu commented on a diff in the pull request:

https://github.com/apache/flink/pull/847#discussion_r33354695
  
--- Diff: 
flink-staging/flink-gelly/src/main/java/org/apache/flink/graph/GraphCsvReader.java
 ---
@@ -0,0 +1,502 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * License); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an AS IS BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.graph;
+import com.google.common.base.Preconditions;
+import org.apache.flink.api.common.functions.MapFunction;
+import org.apache.flink.api.java.DataSet;
+import org.apache.flink.api.java.io.CsvReader;
+import org.apache.flink.api.java.tuple.Tuple2;
+import org.apache.flink.api.java.tuple.Tuple3;
+import org.apache.flink.core.fs.Path;
+import org.apache.flink.api.java.ExecutionEnvironment;
+import org.apache.flink.types.NullValue;
+/**
+ * A class to build a Graph using path(s) provided to CSV file(s) with 
edge (vertices) data
+ * The class also configures the CSV readers used to read edges(vertices) 
data such as the field types,
+ * the delimiters (row and field),  the fields that should be included or 
skipped, and other flags
+ * such as whether to skip the initial line as the header.
+ * The configuration is done using the functions provided in The {@link 
org.apache.flink.api.java.io.CsvReader} class.
+ */
+public class GraphCsvReader<K,VV,EV>{
+
+   private final Path path1,path2;
+   private final ExecutionEnvironment executionContext;
+   protected CsvReader EdgeReader;
+   protected CsvReader VertexReader;
+   protected MapFunction<K, VV> mapper;
+

+//
+
+   public GraphCsvReader(Path path1,Path path2, ExecutionEnvironment 
context)
+   {
+   this.path1 = path1;
+   this.path2 = path2;
+   this.VertexReader = new CsvReader(path1,context);
+   this.EdgeReader = new CsvReader(path2,context);
+   this.mapper=null;
+   this.executionContext=context;
+   }
+
+   public GraphCsvReader(Path path2, ExecutionEnvironment context)
+   {
+   this.path1=null;
+   this.path2 = path2;
+   this.EdgeReader = new CsvReader(path2,context);
+   this.VertexReader = null;
+   this.mapper = null;
+   this.executionContext=context;
+   }
+
+   public GraphCsvReader(Path path2,final MapFunction<K, VV> mapper, 
ExecutionEnvironment context)
+   {
+   this.path1=null;
+   this.path2 = path2;
+   this.EdgeReader = new CsvReader(path2,context);
+   this.VertexReader = null;
+   this.mapper = mapper;
+   this.executionContext=context;
+   }
+
+   public GraphCsvReader (String path2,ExecutionEnvironment context)
+   {
+   this(new Path(Preconditions.checkNotNull(path2, "The file path 
may not be null.")), context);
+
+   }
+
+   public GraphCsvReader(String path1, String path2, ExecutionEnvironment 
context)
+   {
+   this(new Path(Preconditions.checkNotNull(path1, "The file path 
may not be null.")),new Path(Preconditions.checkNotNull(path2, "The file path 
may not be null.")), context);
+   }
+
+
+   public GraphCsvReader (String path2, final MapFunction<K, VV> mapper, 
ExecutionEnvironment context)
+   {
+   this(new Path(Preconditions.checkNotNull(path2, "The 
file path may not be null.")),mapper, context);
+   }
+
+   public CsvReader getEdgeReader()
+   {
+   return this.EdgeReader;
+   }
+
+   public CsvReader getVertexReader()
+   {
+   return this.VertexReader;
+   }
+   

[jira] [Commented] (FLINK-1520) Read edges and vertices from CSV files

2015-06-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14602879#comment-14602879
 ] 

ASF GitHub Bot commented on FLINK-1520:
---

Github user andralungu commented on the pull request:

https://github.com/apache/flink/pull/847#issuecomment-115689621
  
Hi @shghatge ,

I left my set of comments inline. They are mostly related to coding style 
issues. I guess you should revisit the previous comments here.

Also, don't forget to rebase. It seems like there are some merge conflicts 
that need to be fixed :)





[jira] [Commented] (FLINK-1520) Read edges and vertices from CSV files

2015-06-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14602850#comment-14602850
 ] 

ASF GitHub Bot commented on FLINK-1520:
---

Github user andralungu commented on a diff in the pull request:

https://github.com/apache/flink/pull/847#discussion_r33353966
  
--- Diff: 
flink-staging/flink-gelly/src/main/java/org/apache/flink/graph/Graph.java ---
@@ -282,6 +282,54 @@ public void flatMap(Edge<K, EV> edge, 
Collector<Tuple1<K>> out) {
}
 
/**
+   * Creates a graph from CSV files.
+   *
+   * Vertices with value are created from a CSV file with 2 fields
+   * Edges with value are created from a CSV file with 3 fields
+   * from Tuple3.
--- End diff --

There is a trailing "from Tuple3" here...




[jira] [Commented] (FLINK-1520) Read edges and vertices from CSV files

2015-06-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14602858#comment-14602858
 ] 

ASF GitHub Bot commented on FLINK-1520:
---

Github user andralungu commented on a diff in the pull request:

https://github.com/apache/flink/pull/847#discussion_r33354571
  
--- Diff: 
flink-staging/flink-gelly/src/main/java/org/apache/flink/graph/GraphCsvReader.java
 ---
@@ -0,0 +1,502 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * License); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an AS IS BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.graph;
+import com.google.common.base.Preconditions;
+import org.apache.flink.api.common.functions.MapFunction;
+import org.apache.flink.api.java.DataSet;
+import org.apache.flink.api.java.io.CsvReader;
+import org.apache.flink.api.java.tuple.Tuple2;
+import org.apache.flink.api.java.tuple.Tuple3;
+import org.apache.flink.core.fs.Path;
+import org.apache.flink.api.java.ExecutionEnvironment;
+import org.apache.flink.types.NullValue;
+/**
+ * A class to build a Graph using path(s) provided to CSV file(s) with 
edge (vertices) data
+ * The class also configures the CSV readers used to read edges(vertices) 
data such as the field types,
+ * the delimiters (row and field),  the fields that should be included or 
skipped, and other flags
+ * such as whether to skip the initial line as the header.
+ * The configuration is done using the functions provided in The {@link 
org.apache.flink.api.java.io.CsvReader} class.
+ */
+public class GraphCsvReader<K,VV,EV>{
+
+   private final Path path1,path2;
+   private final ExecutionEnvironment executionContext;
+   protected CsvReader EdgeReader;
+   protected CsvReader VertexReader;
+   protected MapFunction<K, VV> mapper;
+

+//
+
+   public GraphCsvReader(Path path1,Path path2, ExecutionEnvironment 
context)
+   {
+   this.path1 = path1;
--- End diff --

again, the path1, path2 issue




[jira] [Commented] (FLINK-1520) Read edges and vertices from CSV files

2015-06-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14602872#comment-14602872
 ] 

ASF GitHub Bot commented on FLINK-1520:
---

Github user andralungu commented on a diff in the pull request:

https://github.com/apache/flink/pull/847#discussion_r33355251
  
--- Diff: 
flink-staging/flink-gelly/src/main/java/org/apache/flink/graph/example/GraphMetrics.java
 ---
@@ -150,20 +149,15 @@ private static boolean parseParameters(String[] args) 
{
}
 
@SuppressWarnings("serial")
-   private static DataSet<Edge<Long, NullValue>> 
getEdgesDataSet(ExecutionEnvironment env) {
-   if (fileOutput) {
-   return env.readCsvFile(edgesInputPath)
-   
.lineDelimiter("\n").fieldDelimiter("\t")
-   .types(Long.class, Long.class).map(
-   new 
MapFunction<Tuple2<Long, Long>, Edge<Long, NullValue>>() {
-
-   public 
Edge<Long, NullValue> map(Tuple2<Long, Long> value) {
-   return 
new Edge<Long, NullValue>(value.f0, value.f1, 
-   
NullValue.getInstance());
-   }
-   });
-   } else {
-   return ExampleUtils.getRandomEdges(env, NUM_VERTICES);
+   private static Graph<Long, NullValue, NullValue> 
getGraph(ExecutionEnvironment env) {
+   if(fileOutput) {
+   return Graph.fromCsvReader(edgesInputPath, 
env).lineDelimiterEdges("\n").fieldDelimiterEdges("\t")
+   
.types(Long.class);
+
+   }
+   else
+   {
+   return 
Graph.fromDataSet(ExampleUtils.getRandomEdges(env, NUM_VERTICES), env);
--- End diff --

Yup... I like how this looks better than the previous rewrites.




[jira] [Commented] (FLINK-1520) Read edges and vertices from CSV files

2015-06-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14602854#comment-14602854
 ] 

ASF GitHub Bot commented on FLINK-1520:
---

Github user andralungu commented on a diff in the pull request:

https://github.com/apache/flink/pull/847#discussion_r33354309
  
--- Diff: 
flink-staging/flink-gelly/src/main/java/org/apache/flink/graph/GraphCsvReader.java
 ---
@@ -0,0 +1,502 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * License); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an AS IS BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.graph;
+import com.google.common.base.Preconditions;
+import org.apache.flink.api.common.functions.MapFunction;
+import org.apache.flink.api.java.DataSet;
+import org.apache.flink.api.java.io.CsvReader;
+import org.apache.flink.api.java.tuple.Tuple2;
+import org.apache.flink.api.java.tuple.Tuple3;
+import org.apache.flink.core.fs.Path;
+import org.apache.flink.api.java.ExecutionEnvironment;
+import org.apache.flink.types.NullValue;
+/**
+ * A class to build a Graph using path(s) provided to CSV file(s) with 
edge (vertices) data
+ * The class also configures the CSV readers used to read edges(vertices) 
data such as the field types,
+ * the delimiters (row and field),  the fields that should be included or 
skipped, and other flags
+ * such as whether to skip the initial line as the header.
+ * The configuration is done using the functions provided in The {@link 
org.apache.flink.api.java.io.CsvReader} class.
+ */
+public class GraphCsvReader<K,VV,EV>{
+
+   private final Path path1,path2;
+   private final ExecutionEnvironment executionContext;
+   protected CsvReader EdgeReader;
+   protected CsvReader VertexReader;
+   protected MapFunction<K, VV> mapper;
+

+//
+
+   public GraphCsvReader(Path path1,Path path2, ExecutionEnvironment 
context)
--- End diff --

Let's not call these path1 and path2. I suggest more descriptive names such as 
edgePath and vertexPath. The same applies to the methods underneath.
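As a small sketch of the suggested naming, the class below keeps one field per input file and names it after its role. It is an illustration only: `GraphCsvPaths` is a hypothetical class, and it uses `java.nio.file.Path` in place of Flink's `org.apache.flink.core.fs.Path` so that it compiles on its own.

```java
import java.nio.file.Path;
import java.util.Objects;

/** Hypothetical holder for the two input paths, named after what they contain. */
final class GraphCsvPaths {
    private final Path vertexPath; // null when the graph is built from edges only
    private final Path edgePath;

    public GraphCsvPaths(Path vertexPath, Path edgePath) {
        this.vertexPath = vertexPath;
        this.edgePath = Objects.requireNonNull(edgePath, "The edge file path may not be null.");
    }

    public GraphCsvPaths(Path edgePath) {
        this(null, edgePath);
    }

    public boolean hasVertexFile() {
        return vertexPath != null;
    }

    public Path getVertexPath() {
        return vertexPath;
    }

    public Path getEdgePath() {
        return edgePath;
    }
}
```

With role-based names, a call such as `new GraphCsvPaths(vertexPath, edgePath)` shows the argument order at the call site, which `path1` and `path2` do not.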




[jira] [Commented] (FLINK-1520) Read edges and vertices from CSV files

2015-06-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14602852#comment-14602852
 ] 

ASF GitHub Bot commented on FLINK-1520:
---

Github user andralungu commented on a diff in the pull request:

https://github.com/apache/flink/pull/847#discussion_r33354071
  
--- Diff: 
flink-staging/flink-gelly/src/main/java/org/apache/flink/graph/Graph.java ---
@@ -282,6 +282,54 @@ public void flatMap(Edge<K, EV> edge, 
Collector<Tuple1<K>> out) {
}
 
/**
+   * Creates a graph from CSV files.
+   *
+   * Vertices with value are created from a CSV file with 2 fields
+   * Edges with value are created from a CSV file with 3 fields
+   * from Tuple3.
+   *
+   * @param verticesPath path to a CSV file with the Vertices data.
+   * @param edgesPath path to a CSV file with the Edges data
+   * @param context the flink execution environment.
+   * @return An instance of {@link org.apache.flink.graph.GraphCsvReader} 
, which on calling types() method to specify types of the
--- End diff --

It should be "on which calling", not "which on"... or "which, on calling the 
types() method, specifies" (not "to specify").




[jira] [Commented] (FLINK-1520) Read edges and vertices from CSV files

2015-06-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14602857#comment-14602857
 ] 

ASF GitHub Bot commented on FLINK-1520:
---

Github user andralungu commented on a diff in the pull request:

https://github.com/apache/flink/pull/847#discussion_r33354535
  
--- Diff: 
flink-staging/flink-gelly/src/main/java/org/apache/flink/graph/GraphCsvReader.java
 ---
@@ -0,0 +1,502 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.graph;
+import com.google.common.base.Preconditions;
+import org.apache.flink.api.common.functions.MapFunction;
+import org.apache.flink.api.java.DataSet;
+import org.apache.flink.api.java.io.CsvReader;
+import org.apache.flink.api.java.tuple.Tuple2;
+import org.apache.flink.api.java.tuple.Tuple3;
+import org.apache.flink.core.fs.Path;
+import org.apache.flink.api.java.ExecutionEnvironment;
+import org.apache.flink.types.NullValue;
+/**
+ * A class to build a Graph using path(s) provided to CSV file(s) with 
edge (vertices) data
+ * The class also configures the CSV readers used to read edges(vertices) 
data such as the field types,
+ * the delimiters (row and field),  the fields that should be included or 
skipped, and other flags
+ * such as whether to skip the initial line as the header.
+ * The configuration is done using the functions provided in The {@link 
org.apache.flink.api.java.io.CsvReader} class.
+ */
+public class GraphCsvReader<K,VV,EV>{
+
+   private final Path path1,path2;
+   private final ExecutionEnvironment executionContext;
+   protected CsvReader EdgeReader;
+   protected CsvReader VertexReader;
+   protected MapFunction<K, VV> mapper;
+

+//
+
+   public GraphCsvReader(Path path1,Path path2, ExecutionEnvironment 
context)
+   {
--- End diff --

Also, @vasia mentioned some general coding style rules in one of the 
previous comments. The placement of opening braces must be consistent, so 
here the brace should open on the same line: 
`public GraphCsvReader(...) {`.

Please check the rest of the file for similar issues.




[jira] [Commented] (FLINK-1520) Read edges and vertices from CSV files

2015-06-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14602873#comment-14602873
 ] 

ASF GitHub Bot commented on FLINK-1520:
---

Github user andralungu commented on a diff in the pull request:

https://github.com/apache/flink/pull/847#discussion_r33355373
  
--- Diff: 
flink-staging/flink-gelly/src/test/java/org/apache/flink/graph/test/operations/GraphCreationWithMapperITCase.java
 ---
@@ -156,4 +181,17 @@ public DummyCustomType map(Long vertexId) {
return new DummyCustomType(vertexId.intValue()-1, 
false);
}
}
+
+   private FileInputSplit createTempFile(String content) throws 
IOException {
+   File tempFile = File.createTempFile("test_contents", "tmp");
+   tempFile.deleteOnExit();
--- End diff --

`deleteOnExit()`... nice!
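
For readers unfamiliar with the pattern, here is a minimal stand-alone sketch of a temp-file helper that relies on `deleteOnExit()`. It is not the helper from the pull request (that one returns a `FileInputSplit`); the class name `TempCsvFiles` and the `readBack` method are made up for illustration, and only JDK classes are used.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;

// Hypothetical helper, not part of Flink or of the pull request.
final class TempCsvFiles {

    // Writes the given content to a new temporary file and returns it.
    static File createTempFile(String content) {
        try {
            File tempFile = File.createTempFile("test_contents", ".tmp");
            // Ask the JVM to delete the file on normal termination,
            // so tests do not leave files behind.
            tempFile.deleteOnExit();
            Files.write(tempFile.toPath(), content.getBytes(StandardCharsets.UTF_8));
            return tempFile;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Reads the whole file back as a UTF-8 string.
    static String readBack(File file) {
        try {
            return new String(Files.readAllBytes(file.toPath()), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Keep in mind that `deleteOnExit()` only takes effect on normal JVM shutdown, so a crashed test run can still leave files in the temp directory.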




[jira] [Commented] (FLINK-1520) Read edges and vertices from CSV files

2015-06-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14602853#comment-14602853
 ] 

ASF GitHub Bot commented on FLINK-1520:
---

Github user andralungu commented on a diff in the pull request:

https://github.com/apache/flink/pull/847#discussion_r33354136
  
--- Diff: 
flink-staging/flink-gelly/src/main/java/org/apache/flink/graph/Graph.java ---
@@ -282,6 +282,54 @@ public void flatMap(Edge<K, EV> edge, 
Collector<Tuple1<K>> out) {
}
 
/**
+   * Creates a graph from CSV files.
+   *
+   * Vertices with value are created from a CSV file with 2 fields
+   * Edges with value are created from a CSV file with 3 fields
+   * from Tuple3.
+   *
+   * @param verticesPath path to a CSV file with the Vertices data.
+   * @param edgesPath path to a CSV file with the Edges data
+   * @param context the flink execution environment.
+   * @return An instance of {@link org.apache.flink.graph.GraphCsvReader} 
, which on calling types() method to specify types of the
+   *Vertex ID, Vertex Value and Edge value returns a Graph
+   */
+   public static  GraphCsvReader fromCsvReader(String verticesPath, String 
edgesPath, ExecutionEnvironment context) {
+   return new GraphCsvReader(verticesPath, edgesPath, context);
+   }
+   /** Creates a graph from a CSV file for Edges., Vertices are
--- End diff --

... Edges. \n (right now it's .,)
Vertices





[jira] [Commented] (FLINK-1520) Read edges and vertices from CSV files

2015-06-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14602871#comment-14602871
 ] 

ASF GitHub Bot commented on FLINK-1520:
---

Github user andralungu commented on a diff in the pull request:

https://github.com/apache/flink/pull/847#discussion_r33355180
  
--- Diff: 
flink-staging/flink-gelly/src/main/java/org/apache/flink/graph/example/ConnectedComponents.java
 ---
@@ -119,23 +113,32 @@ private static boolean parseParameters(String [] 
args) {
return true;
}
 
-   @SuppressWarnings("serial")
-   private static DataSet<Edge<Long, NullValue>> 
getEdgesDataSet(ExecutionEnvironment env) {
-
-   if(fileOutput) {
-   return env.readCsvFile(edgeInputPath)
-   .ignoreComments("#")
-   .fieldDelimiter("\t")
-   .lineDelimiter("\n")
-   .types(Long.class, Long.class)
-   .map(new MapFunction<Tuple2<Long, Long>, 
Edge<Long, NullValue>>() {
-   @Override
-   public Edge<Long, NullValue> 
map(Tuple2<Long, Long> value) throws Exception {
-   return new Edge<Long, 
NullValue>(value.f0, value.f1, NullValue.getInstance());
+   private static Graph<Long, Long, NullValue> 
getGraph(ExecutionEnvironment env)
+   {
+   Graph<Long, Long, NullValue> graph;
+   if(!fileOutput)
+   {
+   DataSet<Edge<Long, NullValue>> edges = 
ConnectedComponentsDefaultData.getDefaultEdgeDataSet(env);
--- End diff --

Let's also keep this consistent. In Single Source Shortest Paths you read 
fromDataSet(getDefault..., env). Maybe we could do that for all the examples.




[jira] [Commented] (FLINK-1520) Read edges and vertices from CSV files

2015-06-24 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14599924#comment-14599924
 ] 

ASF GitHub Bot commented on FLINK-1520:
---

Github user shghatge commented on the pull request:

https://github.com/apache/flink/pull/847#issuecomment-114975168
  
Updated PR




[jira] [Commented] (FLINK-1520) Read edges and vertices from CSV files

2015-06-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14593342#comment-14593342
 ] 

ASF GitHub Bot commented on FLINK-1520:
---

Github user andralungu commented on the pull request:

https://github.com/apache/flink/pull/847#issuecomment-113483077
  
Mea culpa. No, the mapper test is fine.

For the examples comment, I meant going through the classes in the example 
folder and modifying the way the graph is currently read. Right now, we fetch 
the edges via `env.readCsvFile` and then use `Graph.fromDataSet` to create the 
graph. We should do it directly via `Graph.fromCsv`. 

The example in the docs is fine, because it explains how `fromDataSet` works. 
That is still available. 




[jira] [Commented] (FLINK-1520) Read edges and vertices from CSV files

2015-06-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14593241#comment-14593241
 ] 

ASF GitHub Bot commented on FLINK-1520:
---

Github user shghatge commented on the pull request:

https://github.com/apache/flink/pull/847#issuecomment-113439469
  
Hello @Andra
There is one test for fromCsv with a mapper in 
GraphCreationWithMapperITCase.java.
Should I add more tests for that?

Also, for the examples comment, do you mean that I should update the Gelly 
guide by removing the CSV file examples which use env.readCsvFile()?

I will add the other tests. :)




[jira] [Commented] (FLINK-1520) Read edges and vertices from CSV files

2015-06-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14591827#comment-14591827
 ] 

ASF GitHub Bot commented on FLINK-1520:
---

Github user shghatge commented on the pull request:

https://github.com/apache/flink/pull/847#issuecomment-113165614
  
Updated PR




[jira] [Commented] (FLINK-1520) Read edges and vertices from CSV files

2015-06-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14592072#comment-14592072
 ] 

ASF GitHub Bot commented on FLINK-1520:
---

Github user andralungu commented on the pull request:

https://github.com/apache/flink/pull/847#issuecomment-113213190
  
This looks very nice! Someone deserves a virtual :ice_cream: ! 

There are some tests missing: 
- test `fromCSV` with a Mapper
- you just test `types`, `ignoreFirstLines` and `ignoreComments`; let's at 
least add tests for the `lineDelimiter*` and the `fieldDelimiter*` methods. I'm 
sure they work, but tests are written to guarantee that the functionality will 
also be there (at the same quality) in the future (i.e. some exotic code 
addition will not break it) :)
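
To make the delimiter point concrete, here is a rough sketch of what such a test would assert. It deliberately avoids Flink: `DelimiterSketch.parse` is a hypothetical stand-in for a reader configured with custom line and field delimiters, not the `CsvReader` API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical stand-in for a CSV reader with configurable delimiters.
final class DelimiterSketch {

    // Splits the content into rows on lineDelimiter and each row into
    // fields on fieldDelimiter. Both delimiters are treated literally.
    static List<String[]> parse(String content, String lineDelimiter, String fieldDelimiter) {
        List<String[]> rows = new ArrayList<>();
        for (String line : content.split(Pattern.quote(lineDelimiter))) {
            if (line.isEmpty()) {
                continue; // skip empty rows, e.g. between doubled delimiters
            }
            // limit -1 keeps trailing empty fields
            rows.add(line.split(Pattern.quote(fieldDelimiter), -1));
        }
        return rows;
    }
}
```

A test in this spirit feeds input that uses non-default delimiters (say `%` for lines and `;` for fields) and checks the number of rows and the field values, so a later change that silently ignores the configured delimiter would fail the build.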

I saw an outdated comment from Vasia about an unused import. Always run 
`mvn verify` before pushing: it would have caught that :D 




[jira] [Commented] (FLINK-1520) Read edges and vertices from CSV files

2015-06-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14592090#comment-14592090
 ] 

ASF GitHub Bot commented on FLINK-1520:
---

Github user andralungu commented on the pull request:

https://github.com/apache/flink/pull/847#issuecomment-113217602
  
Ah! And I just remembered! Maybe it makes sense to update the examples to 
use `fromCSV` when creating the Graph instead of `getEdgesDataSet`. 




[jira] [Commented] (FLINK-1520) Read edges and vertices from CSV files

2015-06-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14591644#comment-14591644
 ] 

ASF GitHub Bot commented on FLINK-1520:
---

Github user shghatge commented on the pull request:

https://github.com/apache/flink/pull/847#issuecomment-113107660
  
Hello @vasia 
I will follow the guidelines and add the tests you suggested when making a 
commit.
As for the separate configuration methods, my thinking was that if we want to 
configure the readers separately, we could use the get methods for the 
CsvReaders and then configure them. But I will add the separate methods now.

Thanks for the detailed guidance.  :)




[jira] [Commented] (FLINK-1520) Read edges and vertices from CSV files

2015-06-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14590556#comment-14590556
 ] 

ASF GitHub Bot commented on FLINK-1520:
---

Github user vasia commented on a diff in the pull request:

https://github.com/apache/flink/pull/847#discussion_r32672805
  
--- Diff: 
flink-staging/flink-gelly/src/main/java/org/apache/flink/graph/Graph.java ---
@@ -282,6 +282,58 @@ public void flatMap(Edge<K, EV> edge, 
Collector<Tuple1<K>> out) {
}
 
/**
+   * Creates a graph from CSV files.
+   *
+   * Vertices with value are created from a CSV file with 2 fields
+   * Edges with value are created from a CSV file with 3 fields
+   * from Tuple3.
+   *
+   * @param path1 path to a CSV file with the Vertices data.
+   * @param path2 path to a CSV file with the Edges data
+   * @param context the flink execution environment.
+   * @return An instance of {@link org.apache.flink.graph.GraphCsvReader} 
, which on calling types() method to specify types of the
+   *Vertex ID, Vertex Value and Edge value returns a Graph
+   */
+
+   public static  GraphCsvReader fromCsvReader(String path1, String path2, 
ExecutionEnvironment context){
+   return (new GraphCsvReader(path1,path2,context));
--- End diff --

parentheses not needed




[jira] [Commented] (FLINK-1520) Read edges and vertices from CSV files

2015-06-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14590559#comment-14590559
 ] 

ASF GitHub Bot commented on FLINK-1520:
---

Github user vasia commented on a diff in the pull request:

https://github.com/apache/flink/pull/847#discussion_r32672995
  
--- Diff: 
flink-staging/flink-gelly/src/main/java/org/apache/flink/graph/Graph.java ---
@@ -282,6 +282,58 @@ public void flatMap(Edge<K, EV> edge, 
Collector<Tuple1<K>> out) {
}
 
/**
+   * Creates a graph from CSV files.
+   *
+   * Vertices with value are created from a CSV file with 2 fields
+   * Edges with value are created from a CSV file with 3 fields
+   * from Tuple3.
+   *
+   * @param path1 path to a CSV file with the Vertices data.
+   * @param path2 path to a CSV file with the Edges data
+   * @param context the flink execution environment.
+   * @return An instance of {@link org.apache.flink.graph.GraphCsvReader} 
, which on calling types() method to specify types of the
+   *Vertex ID, Vertex Value and Edge value returns a Graph
+   */
+
+   public static  GraphCsvReader fromCsvReader(String path1, String path2, 
ExecutionEnvironment context){
+   return (new GraphCsvReader(path1,path2,context));
+   }
+   /** Creates a graph from a CSV file for Edges., Vertices are
+   * induced from the edges.
+   *
+   * Edges with value are created from a CSV file with 3 fields. Vertices 
are created
+   * automatically and their values are set to NullValue.
+   *
+   * @param path a path to a CSV file with the Edges data
+   * @param context the flink execution environment.
+   * @return An instance of {@link org.apache.flink.graph.GraphCsvReader} 
, which on calling types() method to specify types of the
+   * Vertex ID, Vertex Value and Edge value returns a Graph
+   */
+
+   public static GraphCsvReader fromCsvReader(String path, 
ExecutionEnvironment context){
+   return (new GraphCsvReader(path,context));
+   }
+
+   /**
+*Creates a graph from a CSV file for Edges., Vertices are
+* induced from the edges and vertex values are calculated by a mapper
+* function.  Edges with value are created from a CSV file with 3 
fields.
+* Vertices are created automatically and their values are set by 
applying the provided map
+* function to the vertex ids.
+*
+* @param path a path to a CSV file with the Edges data
+* @param mapper the mapper function.
+* @param context the flink execution environment.
+* @return An instance of {@link org.apache.flink.graph.GraphCsvReader} 
, which on calling types() method to specify types of the
+* Vertex ID, Vertex Value and Edge value returns a Graph
+*/
+
+   public static GraphCsvReader fromCsvReader(String path, final 
MapFunction mapper,ExecutionEnvironment context)
+   {
+   return (new GraphCsvReader(path,mapper,context));
--- End diff --

same applies here :)




[jira] [Commented] (FLINK-1520) Read edges and vertices from CSV files

2015-06-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14590558#comment-14590558
 ] 

ASF GitHub Bot commented on FLINK-1520:
---

Github user vasia commented on a diff in the pull request:

https://github.com/apache/flink/pull/847#discussion_r32672972
  
--- Diff: 
flink-staging/flink-gelly/src/main/java/org/apache/flink/graph/Graph.java ---
@@ -282,6 +282,58 @@ public void flatMap(Edge<K, EV> edge, 
Collector<Tuple1<K>> out) {
}
 
/**
+   * Creates a graph from CSV files.
+   *
+   * Vertices with value are created from a CSV file with 2 fields
+   * Edges with value are created from a CSV file with 3 fields
+   * from Tuple3.
+   *
+   * @param path1 path to a CSV file with the Vertices data.
+   * @param path2 path to a CSV file with the Edges data
+   * @param context the flink execution environment.
+   * @return An instance of {@link org.apache.flink.graph.GraphCsvReader} 
, which on calling types() method to specify types of the
+   *Vertex ID, Vertex Value and Edge value returns a Graph
+   */
+
+   public static  GraphCsvReader fromCsvReader(String path1, String path2, 
ExecutionEnvironment context){
+   return (new GraphCsvReader(path1,path2,context));
+   }
+   /** Creates a graph from a CSV file for Edges., Vertices are
+   * induced from the edges.
+   *
+   * Edges with value are created from a CSV file with 3 fields. Vertices 
are created
+   * automatically and their values are set to NullValue.
+   *
+   * @param path a path to a CSV file with the Edges data
+   * @param context the flink execution environment.
+   * @return An instance of {@link org.apache.flink.graph.GraphCsvReader} 
, which on calling types() method to specify types of the
+   * Vertex ID, Vertex Value and Edge value returns a Graph
+   */
+
+   public static GraphCsvReader fromCsvReader(String path, 
ExecutionEnvironment context){
+   return (new GraphCsvReader(path,context));
--- End diff --

Parentheses here too.
Also, it's nice to have a space after commas when separating arguments and 
a space before the curly bracket that defines the start of the method.
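
Something like the following, as a sketch. `PathPair` is a made-up class that only illustrates the formatting (a space after each comma, a space before the opening brace, no parentheses around the returned expression); it is not code from this pull request.

```java
// Hypothetical class used only to illustrate the formatting conventions.
final class PathPair {
    final String verticesPath;
    final String edgesPath;

    PathPair(String verticesPath, String edgesPath) {
        this.verticesPath = verticesPath;
        this.edgesPath = edgesPath;
    }

    // Space after each comma, space before '{', and a bare 'return new ...'
    // instead of 'return (new ...)'.
    static PathPair of(String verticesPath, String edgesPath) {
        return new PathPair(verticesPath, edgesPath);
    }
}
```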




[jira] [Commented] (FLINK-1520) Read edges and vertices from CSV files

2015-06-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14590580#comment-14590580
 ] 

ASF GitHub Bot commented on FLINK-1520:
---

Github user vasia commented on a diff in the pull request:

https://github.com/apache/flink/pull/847#discussion_r32673548
  
--- Diff: 
flink-staging/flink-gelly/src/main/java/org/apache/flink/graph/GraphCsvReader.java
 ---
@@ -0,0 +1,388 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.graph;
+import com.google.common.base.Preconditions;
+import org.apache.flink.api.common.functions.MapFunction;
+import org.apache.flink.api.java.DataSet;
+import org.apache.flink.api.java.io.CsvReader;
+import org.apache.flink.api.java.tuple.Tuple2;
+import org.apache.flink.api.java.tuple.Tuple3;
+import org.apache.flink.core.fs.Path;
+import org.apache.flink.api.java.ExecutionEnvironment;
+import org.apache.flink.graph.Graph;
+import org.apache.flink.types.NullValue;
+import org.apache.flink.core.fs.Path;
+
+
+/**
+ * A class to build a Graph using path(s) provided to CSV file(s) with 
edge (vertices) data
+ * The class also configures the CSV readers used to read edges(vertices) 
data such as the field types,
+ * the delimiters (row and field),  the fields that should be included or 
skipped, and other flags
+ * such as whether to skip the initial line as the header.
+ * The configuration is done using the functions provided in The {@link 
org.apache.flink.api.java.io.CsvReader} class.
+ */
+
+
+
+public class GraphCsvReader{
+
+   private final Path path1,path2;
+   private final ExecutionEnvironment executionContext;
+
+   private Path edgePath;
+   private Path vertexPath;
+   protected CsvReader EdgeReader;
+   protected CsvReader VertexReader;
+   protected MapFunction mapper;
+

+//
+
+   public GraphCsvReader(Path path1,Path path2, ExecutionEnvironment 
context)
+   {
+   this.path1 = path1;
+   this.path2 = path2;
+   this.VertexReader = new CsvReader(path1,context);
+   this.EdgeReader = new CsvReader(path2,context);
+   this.mapper=null;
+   this.executionContext=context;
+   }
+
+   public GraphCsvReader(Path path2, ExecutionEnvironment context)
+   {
+   this.path1=null;
+   this.path2 = path2;
+   this.EdgeReader = new CsvReader(path2,context);
+   this.VertexReader = null;
+   this.mapper = null;
+   this.executionContext=context;
+   }
+
+   public GraphCsvReader(Path path2,final MapFunction mapper, 
ExecutionEnvironment context)
--- End diff --

add type arguments to MapFunction
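
A small stand-alone illustration of why the type arguments matter. `Mapper` below is a hypothetical stand-in for Flink's `MapFunction` (declared locally so the snippet compiles without Flink), and `VertexValueInitializer` is a made-up name.

```java
// Hypothetical stand-in for Flink's MapFunction, declared locally so the
// snippet compiles without Flink on the classpath.
interface Mapper<T, O> {
    O map(T value);
}

final class VertexValueInitializer<K, VV> {
    // With type arguments the compiler checks that the mapper really turns
    // vertex IDs (K) into vertex values (VV). A raw 'Mapper mapper' field
    // would compile with unchecked warnings and push errors to runtime.
    private final Mapper<K, VV> mapper;

    VertexValueInitializer(Mapper<K, VV> mapper) {
        this.mapper = mapper;
    }

    VV valueFor(K vertexId) {
        return mapper.map(vertexId);
    }
}
```

With the parameterized field, passing a mapper of the wrong input or output type is rejected at compile time instead of surfacing as a `ClassCastException` when the job runs.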




[jira] [Commented] (FLINK-1520) Read edges and vertices from CSV files

2015-06-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14590690#comment-14590690
 ] 

ASF GitHub Bot commented on FLINK-1520:
---

Github user vasia commented on a diff in the pull request:

https://github.com/apache/flink/pull/847#discussion_r32679424
  
--- Diff: 
flink-staging/flink-gelly/src/main/java/org/apache/flink/graph/GraphCsvReader.java
 ---
@@ -0,0 +1,388 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.graph;
+import com.google.common.base.Preconditions;
+import org.apache.flink.api.common.functions.MapFunction;
+import org.apache.flink.api.java.DataSet;
+import org.apache.flink.api.java.io.CsvReader;
+import org.apache.flink.api.java.tuple.Tuple2;
+import org.apache.flink.api.java.tuple.Tuple3;
+import org.apache.flink.core.fs.Path;
+import org.apache.flink.api.java.ExecutionEnvironment;
+import org.apache.flink.graph.Graph;
+import org.apache.flink.types.NullValue;
+import org.apache.flink.core.fs.Path;
+
+
+/**
+ * A class to build a Graph using path(s) provided to CSV file(s) with 
edge (vertices) data
+ * The class also configures the CSV readers used to read edges(vertices) 
data such as the field types,
+ * the delimiters (row and field),  the fields that should be included or 
skipped, and other flags
+ * such as whether to skip the initial line as the header.
+ * The configuration is done using the functions provided in The {@link 
org.apache.flink.api.java.io.CsvReader} class.
+ */
+
+
+
+public class GraphCsvReader{
+
+   private final Path path1,path2;
+   private final ExecutionEnvironment executionContext;
+
+   private Path edgePath;
+   private Path vertexPath;
+   protected CsvReader EdgeReader;
+   protected CsvReader VertexReader;
+   protected MapFunction mapper;
+

+//
+
+   public GraphCsvReader(Path path1,Path path2, ExecutionEnvironment 
context)
+   {
+   this.path1 = path1;
+   this.path2 = path2;
+   this.VertexReader = new CsvReader(path1,context);
+   this.EdgeReader = new CsvReader(path2,context);
+   this.mapper=null;
+   this.executionContext=context;
+   }
+
+   public GraphCsvReader(Path path2, ExecutionEnvironment context)
+   {
+   this.path1=null;
+   this.path2 = path2;
+   this.EdgeReader = new CsvReader(path2,context);
+   this.VertexReader = null;
+   this.mapper = null;
+   this.executionContext=context;
+   }
+
+   public GraphCsvReader(Path path2,final MapFunction mapper, 
ExecutionEnvironment context)
+   {
+   this.path1=null;
+   this.path2 = path2;
+   this.EdgeReader = new CsvReader(path2,context);
+   this.VertexReader = null;
+   this.mapper = mapper;
+   this.executionContext=context;
+   }
+
+   public GraphCsvReader (String path2,ExecutionEnvironment context)
+   {
+   this(new Path(Preconditions.checkNotNull(path2, "The file path 
may not be null.")), context);
+
+   }
+
+   public GraphCsvReader(String path1, String path2, ExecutionEnvironment 
context)
+   {
+   this(new Path(Preconditions.checkNotNull(path1, "The file path 
may not be null.")),new Path(Preconditions.checkNotNull(path2, "The file path 
may not be null.")), context);
+   }
+
+
+   public GraphCsvReader (String path2, final MapFunction mapper, 
ExecutionEnvironment context)
+   {
+
+   this(new Path(Preconditions.checkNotNull(path2, "The 
file path may not be null.")),mapper, context);
+
+
+   }
+
+   public CsvReader getEdgeReader()
+   {
+   

[jira] [Commented] (FLINK-1520) Read edges and vertices from CSV files

2015-06-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14590591#comment-14590591
 ] 

ASF GitHub Bot commented on FLINK-1520:
---

Github user shghatge commented on a diff in the pull request:

https://github.com/apache/flink/pull/847#discussion_r32674326
  
--- Diff: docs/libs/gelly_guide.md ---
@@ -102,6 +102,15 @@ DataSet<Tuple3<String, String, Double>> edgeTuples = env.readCsvFile("path/to/ed
 Graph<String, Long, Double> graph = Graph.fromTupleDataSet(vertexTuples, edgeTuples, env);
 {% endhighlight %}
 
+* from a CSV file with three fields and an optional CSV file with 2 fields. In this case, Gelly will convert each row from the first CSV file to an `Edge`, where the first field will be the source ID, the second field will be the target ID and the third field will be the edge value. Equivalently, each row from the second CSV file will be converted to a `Vertex`, where the first field will be the vertex ID and the second field will be the vertex value. A types() method is called on the GraphCsvReader object returned by fromCsvReader() to inform the CsvReader of the types of the fields :
--- End diff --

oh! Will fix this.
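
For readers following the thread, the row-to-`Edge` and row-to-`Vertex` conversion described in the documentation hunk above can be illustrated without Flink. The following is only a dependency-free sketch and not Gelly's implementation: the class `CsvGraphSketch`, the nested types `EdgeRow` and `VertexRow`, and both parse methods are hypothetical names chosen for illustration, and the field types (String IDs, Long vertex values, Double edge values) are assumed from the `Graph<String, Long, Double>` example in the guide.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical, Flink-free illustration of the CSV-to-graph mapping; not Gelly code.
final class CsvGraphSketch {

    /** One edge row: source ID, target ID, edge value (three CSV fields). */
    static final class EdgeRow {
        final String source;
        final String target;
        final double value;

        EdgeRow(String source, String target, double value) {
            this.source = source;
            this.target = target;
            this.value = value;
        }
    }

    /** One vertex row: vertex ID, vertex value (two CSV fields). */
    static final class VertexRow {
        final String id;
        final long value;

        VertexRow(String id, long value) {
            this.id = id;
            this.value = value;
        }
    }

    /** Converts each three-field line into an EdgeRow. */
    static List<EdgeRow> parseEdges(List<String> lines) {
        List<EdgeRow> edges = new ArrayList<>();
        for (String line : lines) {
            String[] f = line.split(",");
            if (f.length != 3) {
                throw new IllegalArgumentException("Expected 3 fields: " + line);
            }
            edges.add(new EdgeRow(f[0].trim(), f[1].trim(), Double.parseDouble(f[2].trim())));
        }
        return edges;
    }

    /** Converts each two-field line into a VertexRow. */
    static List<VertexRow> parseVertices(List<String> lines) {
        List<VertexRow> vertices = new ArrayList<>();
        for (String line : lines) {
            String[] f = line.split(",");
            if (f.length != 2) {
                throw new IllegalArgumentException("Expected 2 fields: " + line);
            }
            vertices.add(new VertexRow(f[0].trim(), Long.parseLong(f[1].trim())));
        }
        return vertices;
    }

    public static void main(String[] args) {
        List<EdgeRow> edges = parseEdges(Arrays.asList("a,b,0.5", "b,c,1.5"));
        List<VertexRow> vertices = parseVertices(Arrays.asList("a,1", "b,2", "c,3"));
        System.out.println(edges.size() + " edges, " + vertices.size() + " vertices");
    }
}
```

In the actual pull request the field types are not hard-coded as they are here; they are supplied through the `types()` call on the returned `GraphCsvReader`, as the quoted paragraph states.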


> Read edges and vertices from CSV files
> --
>
> Key: FLINK-1520
> URL: https://issues.apache.org/jira/browse/FLINK-1520
> Project: Flink
>  Issue Type: New Feature
>  Components: Gelly
>Reporter: Vasia Kalavri
>Assignee: Shivani Ghatge
>Priority: Minor
>  Labels: easyfix, newbie
>
> Add methods to create Vertex and Edge Datasets directly from CSV file inputs.





[jira] [Commented] (FLINK-1520) Read edges and vertices from CSV files

2015-06-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14590599#comment-14590599
 ] 

ASF GitHub Bot commented on FLINK-1520:
---

Github user shghatge commented on a diff in the pull request:

https://github.com/apache/flink/pull/847#discussion_r32674875
  
--- Diff: 
flink-staging/flink-gelly/src/main/java/org/apache/flink/graph/GraphCsvReader.java
 ---
@@ -0,0 +1,388 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.graph;
+import com.google.common.base.Preconditions;
+import org.apache.flink.api.common.functions.MapFunction;
+import org.apache.flink.api.java.DataSet;
+import org.apache.flink.api.java.io.CsvReader;
+import org.apache.flink.api.java.tuple.Tuple2;
+import org.apache.flink.api.java.tuple.Tuple3;
+import org.apache.flink.core.fs.Path;
+import org.apache.flink.api.java.ExecutionEnvironment;
+import org.apache.flink.graph.Graph;
+import org.apache.flink.types.NullValue;
+import org.apache.flink.core.fs.Path;
+
+
+/**
+ * A class to build a Graph using path(s) provided to CSV file(s) with 
edge (vertices) data
+ * The class also configures the CSV readers used to read edges(vertices) 
data such as the field types,
+ * the delimiters (row and field),  the fields that should be included or 
skipped, and other flags
+ * such as whether to skip the initial line as the header.
+ * The configuration is done using the functions provided in The {@link 
org.apache.flink.api.java.io.CsvReader} class.
+ */
+
+
+
+public class GraphCsvReader{
+
+   private final Path path1,path2;
+   private final ExecutionEnvironment executionContext;
+
+   private Path edgePath;
+   private Path vertexPath;
+   protected CsvReader EdgeReader;
+   protected CsvReader VertexReader;
+   protected MapFunction mapper;
+

+//
+
+   public GraphCsvReader(Path path1,Path path2, ExecutionEnvironment 
context)
+   {
+   this.path1 = path1;
+   this.path2 = path2;
+   this.VertexReader = new CsvReader(path1,context);
+   this.EdgeReader = new CsvReader(path2,context);
+   this.mapper=null;
+   this.executionContext=context;
+   }
+
+   public GraphCsvReader(Path path2, ExecutionEnvironment context)
+   {
+   this.path1=null;
+   this.path2 = path2;
+   this.EdgeReader = new CsvReader(path2,context);
+   this.VertexReader = null;
+   this.mapper = null;
+   this.executionContext=context;
+   }
+
+   public GraphCsvReader(Path path2,final MapFunction mapper, 
ExecutionEnvironment context)
+   {
+   this.path1=null;
+   this.path2 = path2;
+   this.EdgeReader = new CsvReader(path2,context);
+   this.VertexReader = null;
+   this.mapper = mapper;
+   this.executionContext=context;
+   }
+
+   public GraphCsvReader (String path2,ExecutionEnvironment context)
+   {
+   this(new Path(Preconditions.checkNotNull(path2, "The file path may not be null.")), context);
+
+   }
+
+   public GraphCsvReader(String path1, String path2, ExecutionEnvironment context)
+   {
+   this(new Path(Preconditions.checkNotNull(path1, "The file path may not be null.")),new Path(Preconditions.checkNotNull(path2, "The file path may not be null.")), context);
+   }
+
+
+   public GraphCsvReader (String path2, final MapFunction mapper, ExecutionEnvironment context)
+   {
+
+   this(new Path(Preconditions.checkNotNull(path2, "The file path may not be null.")),mapper, context);
+
+
+   }
+
+   public CsvReader getEdgeReader()
+   {
+

[jira] [Commented] (FLINK-1520) Read edges and vertices from CSV files

2015-06-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14590607#comment-14590607
 ] 

ASF GitHub Bot commented on FLINK-1520:
---

Github user vasia commented on a diff in the pull request:

https://github.com/apache/flink/pull/847#discussion_r32675011
  
--- Diff: 
flink-staging/flink-gelly/src/main/java/org/apache/flink/graph/GraphCsvReader.java
 ---
@@ -0,0 +1,388 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.graph;
+import com.google.common.base.Preconditions;
+import org.apache.flink.api.common.functions.MapFunction;
+import org.apache.flink.api.java.DataSet;
+import org.apache.flink.api.java.io.CsvReader;
+import org.apache.flink.api.java.tuple.Tuple2;
+import org.apache.flink.api.java.tuple.Tuple3;
+import org.apache.flink.core.fs.Path;
+import org.apache.flink.api.java.ExecutionEnvironment;
+import org.apache.flink.graph.Graph;
+import org.apache.flink.types.NullValue;
+import org.apache.flink.core.fs.Path;
+
+
+/**
+ * A class to build a Graph using path(s) provided to CSV file(s) with 
edge (vertices) data
+ * The class also configures the CSV readers used to read edges(vertices) 
data such as the field types,
+ * the delimiters (row and field),  the fields that should be included or 
skipped, and other flags
+ * such as whether to skip the initial line as the header.
+ * The configuration is done using the functions provided in The {@link 
org.apache.flink.api.java.io.CsvReader} class.
+ */
+
+
+
+public class GraphCsvReader{
+
+   private final Path path1,path2;
+   private final ExecutionEnvironment executionContext;
+
+   private Path edgePath;
+   private Path vertexPath;
+   protected CsvReader EdgeReader;
+   protected CsvReader VertexReader;
+   protected MapFunction mapper;
+

+//
+
+   public GraphCsvReader(Path path1,Path path2, ExecutionEnvironment 
context)
+   {
+   this.path1 = path1;
+   this.path2 = path2;
+   this.VertexReader = new CsvReader(path1,context);
+   this.EdgeReader = new CsvReader(path2,context);
+   this.mapper=null;
+   this.executionContext=context;
+   }
+
+   public GraphCsvReader(Path path2, ExecutionEnvironment context)
+   {
+   this.path1=null;
+   this.path2 = path2;
+   this.EdgeReader = new CsvReader(path2,context);
+   this.VertexReader = null;
+   this.mapper = null;
+   this.executionContext=context;
+   }
+
+   public GraphCsvReader(Path path2,final MapFunction mapper, 
ExecutionEnvironment context)
+   {
+   this.path1=null;
+   this.path2 = path2;
+   this.EdgeReader = new CsvReader(path2,context);
+   this.VertexReader = null;
+   this.mapper = mapper;
+   this.executionContext=context;
+   }
+
+   public GraphCsvReader (String path2,ExecutionEnvironment context)
+   {
+   this(new Path(Preconditions.checkNotNull(path2, "The file path may not be null.")), context);
+
+   }
+
+   public GraphCsvReader(String path1, String path2, ExecutionEnvironment context)
+   {
+   this(new Path(Preconditions.checkNotNull(path1, "The file path may not be null.")),new Path(Preconditions.checkNotNull(path2, "The file path may not be null.")), context);
+   }
+
+
+   public GraphCsvReader (String path2, final MapFunction mapper, ExecutionEnvironment context)
+   {
+
+   this(new Path(Preconditions.checkNotNull(path2, "The file path may not be null.")),mapper, context);
+
+
+   }
+
+   public CsvReader getEdgeReader()
+   {
+   

[jira] [Commented] (FLINK-1520) Read edges and vertices from CSV files

2015-06-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14590606#comment-14590606
 ] 

ASF GitHub Bot commented on FLINK-1520:
---

Github user vasia commented on a diff in the pull request:

https://github.com/apache/flink/pull/847#discussion_r32674993
  
--- Diff: 
flink-staging/flink-gelly/src/main/java/org/apache/flink/graph/GraphCsvReader.java
 ---
@@ -0,0 +1,388 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.graph;
+import com.google.common.base.Preconditions;
+import org.apache.flink.api.common.functions.MapFunction;
+import org.apache.flink.api.java.DataSet;
+import org.apache.flink.api.java.io.CsvReader;
+import org.apache.flink.api.java.tuple.Tuple2;
+import org.apache.flink.api.java.tuple.Tuple3;
+import org.apache.flink.core.fs.Path;
+import org.apache.flink.api.java.ExecutionEnvironment;
+import org.apache.flink.graph.Graph;
+import org.apache.flink.types.NullValue;
+import org.apache.flink.core.fs.Path;
+
+
+/**
+ * A class to build a Graph using path(s) provided to CSV file(s) with 
edge (vertices) data
+ * The class also configures the CSV readers used to read edges(vertices) 
data such as the field types,
+ * the delimiters (row and field),  the fields that should be included or 
skipped, and other flags
+ * such as whether to skip the initial line as the header.
+ * The configuration is done using the functions provided in The {@link 
org.apache.flink.api.java.io.CsvReader} class.
+ */
+
+
+
+public class GraphCsvReader{
+
+   private final Path path1,path2;
+   private final ExecutionEnvironment executionContext;
+
+   private Path edgePath;
+   private Path vertexPath;
+   protected CsvReader EdgeReader;
+   protected CsvReader VertexReader;
+   protected MapFunction mapper;
+

+//
+
+   public GraphCsvReader(Path path1,Path path2, ExecutionEnvironment 
context)
+   {
+   this.path1 = path1;
+   this.path2 = path2;
+   this.VertexReader = new CsvReader(path1,context);
+   this.EdgeReader = new CsvReader(path2,context);
+   this.mapper=null;
+   this.executionContext=context;
+   }
+
+   public GraphCsvReader(Path path2, ExecutionEnvironment context)
+   {
+   this.path1=null;
+   this.path2 = path2;
+   this.EdgeReader = new CsvReader(path2,context);
+   this.VertexReader = null;
+   this.mapper = null;
+   this.executionContext=context;
+   }
+
+   public GraphCsvReader(Path path2,final MapFunction mapper, 
ExecutionEnvironment context)
+   {
+   this.path1=null;
+   this.path2 = path2;
+   this.EdgeReader = new CsvReader(path2,context);
+   this.VertexReader = null;
+   this.mapper = mapper;
+   this.executionContext=context;
+   }
+
+   public GraphCsvReader (String path2,ExecutionEnvironment context)
+   {
+   this(new Path(Preconditions.checkNotNull(path2, "The file path may not be null.")), context);
+
+   }
+
+   public GraphCsvReader(String path1, String path2, ExecutionEnvironment context)
+   {
+   this(new Path(Preconditions.checkNotNull(path1, "The file path may not be null.")),new Path(Preconditions.checkNotNull(path2, "The file path may not be null.")), context);
+   }
+
+
+   public GraphCsvReader (String path2, final MapFunction mapper, ExecutionEnvironment context)
+   {
+
+   this(new Path(Preconditions.checkNotNull(path2, "The file path may not be null.")),mapper, context);
+
+
+   }
+
+   public CsvReader getEdgeReader()
+   {
+   

[jira] [Commented] (FLINK-1520) Read edges and vertices from CSV files

2015-06-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14590605#comment-14590605
 ] 

ASF GitHub Bot commented on FLINK-1520:
---

Github user vasia commented on a diff in the pull request:

https://github.com/apache/flink/pull/847#discussion_r32674967
  
--- Diff: 
flink-staging/flink-gelly/src/main/java/org/apache/flink/graph/GraphCsvReader.java
 ---
@@ -0,0 +1,388 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.graph;
+import com.google.common.base.Preconditions;
+import org.apache.flink.api.common.functions.MapFunction;
+import org.apache.flink.api.java.DataSet;
+import org.apache.flink.api.java.io.CsvReader;
+import org.apache.flink.api.java.tuple.Tuple2;
+import org.apache.flink.api.java.tuple.Tuple3;
+import org.apache.flink.core.fs.Path;
+import org.apache.flink.api.java.ExecutionEnvironment;
+import org.apache.flink.graph.Graph;
+import org.apache.flink.types.NullValue;
+import org.apache.flink.core.fs.Path;
+
+
+/**
+ * A class to build a Graph using path(s) provided to CSV file(s) with 
edge (vertices) data
+ * The class also configures the CSV readers used to read edges(vertices) 
data such as the field types,
+ * the delimiters (row and field),  the fields that should be included or 
skipped, and other flags
+ * such as whether to skip the initial line as the header.
+ * The configuration is done using the functions provided in The {@link 
org.apache.flink.api.java.io.CsvReader} class.
+ */
+
+
+
+public class GraphCsvReader{
+
+   private final Path path1,path2;
+   private final ExecutionEnvironment executionContext;
+
+   private Path edgePath;
+   private Path vertexPath;
+   protected CsvReader EdgeReader;
+   protected CsvReader VertexReader;
+   protected MapFunction mapper;
+

+//
+
+   public GraphCsvReader(Path path1,Path path2, ExecutionEnvironment 
context)
+   {
+   this.path1 = path1;
+   this.path2 = path2;
+   this.VertexReader = new CsvReader(path1,context);
+   this.EdgeReader = new CsvReader(path2,context);
+   this.mapper=null;
+   this.executionContext=context;
+   }
+
+   public GraphCsvReader(Path path2, ExecutionEnvironment context)
+   {
+   this.path1=null;
+   this.path2 = path2;
+   this.EdgeReader = new CsvReader(path2,context);
+   this.VertexReader = null;
+   this.mapper = null;
+   this.executionContext=context;
+   }
+
+   public GraphCsvReader(Path path2,final MapFunction mapper, 
ExecutionEnvironment context)
+   {
+   this.path1=null;
+   this.path2 = path2;
+   this.EdgeReader = new CsvReader(path2,context);
+   this.VertexReader = null;
+   this.mapper = mapper;
+   this.executionContext=context;
+   }
+
+   public GraphCsvReader (String path2,ExecutionEnvironment context)
+   {
+   this(new Path(Preconditions.checkNotNull(path2, "The file path may not be null.")), context);
+
+   }
+
+   public GraphCsvReader(String path1, String path2, ExecutionEnvironment context)
+   {
+   this(new Path(Preconditions.checkNotNull(path1, "The file path may not be null.")),new Path(Preconditions.checkNotNull(path2, "The file path may not be null.")), context);
+   }
+
+
+   public GraphCsvReader (String path2, final MapFunction mapper, ExecutionEnvironment context)
+   {
+
+   this(new Path(Preconditions.checkNotNull(path2, "The file path may not be null.")),mapper, context);
+
+
+   }
+
+   public CsvReader getEdgeReader()
+   {
+   

[jira] [Commented] (FLINK-1520) Read edges and vertices from CSV files

2015-06-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14590601#comment-14590601
 ] 

ASF GitHub Bot commented on FLINK-1520:
---

Github user vasia commented on a diff in the pull request:

https://github.com/apache/flink/pull/847#discussion_r32674908
  
--- Diff: 
flink-staging/flink-gelly/src/main/java/org/apache/flink/graph/GraphCsvReader.java
 ---
@@ -0,0 +1,388 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.graph;
+import com.google.common.base.Preconditions;
+import org.apache.flink.api.common.functions.MapFunction;
+import org.apache.flink.api.java.DataSet;
+import org.apache.flink.api.java.io.CsvReader;
+import org.apache.flink.api.java.tuple.Tuple2;
+import org.apache.flink.api.java.tuple.Tuple3;
+import org.apache.flink.core.fs.Path;
+import org.apache.flink.api.java.ExecutionEnvironment;
+import org.apache.flink.graph.Graph;
+import org.apache.flink.types.NullValue;
+import org.apache.flink.core.fs.Path;
+
+
+/**
+ * A class to build a Graph using path(s) provided to CSV file(s) with 
edge (vertices) data
+ * The class also configures the CSV readers used to read edges(vertices) 
data such as the field types,
+ * the delimiters (row and field),  the fields that should be included or 
skipped, and other flags
+ * such as whether to skip the initial line as the header.
+ * The configuration is done using the functions provided in The {@link 
org.apache.flink.api.java.io.CsvReader} class.
+ */
+
+
+
+public class GraphCsvReader{
+
+   private final Path path1,path2;
+   private final ExecutionEnvironment executionContext;
+
+   private Path edgePath;
+   private Path vertexPath;
+   protected CsvReader EdgeReader;
+   protected CsvReader VertexReader;
+   protected MapFunction mapper;
+

+//
+
+   public GraphCsvReader(Path path1,Path path2, ExecutionEnvironment 
context)
+   {
+   this.path1 = path1;
+   this.path2 = path2;
+   this.VertexReader = new CsvReader(path1,context);
+   this.EdgeReader = new CsvReader(path2,context);
+   this.mapper=null;
+   this.executionContext=context;
+   }
+
+   public GraphCsvReader(Path path2, ExecutionEnvironment context)
+   {
+   this.path1=null;
+   this.path2 = path2;
+   this.EdgeReader = new CsvReader(path2,context);
+   this.VertexReader = null;
+   this.mapper = null;
+   this.executionContext=context;
+   }
+
+   public GraphCsvReader(Path path2,final MapFunction mapper, 
ExecutionEnvironment context)
+   {
+   this.path1=null;
+   this.path2 = path2;
+   this.EdgeReader = new CsvReader(path2,context);
+   this.VertexReader = null;
+   this.mapper = mapper;
+   this.executionContext=context;
+   }
+
+   public GraphCsvReader (String path2,ExecutionEnvironment context)
+   {
+   this(new Path(Preconditions.checkNotNull(path2, "The file path may not be null.")), context);
+
+   }
+
+   public GraphCsvReader(String path1, String path2, ExecutionEnvironment context)
+   {
+   this(new Path(Preconditions.checkNotNull(path1, "The file path may not be null.")),new Path(Preconditions.checkNotNull(path2, "The file path may not be null.")), context);
+   }
+
+
+   public GraphCsvReader (String path2, final MapFunction mapper, ExecutionEnvironment context)
+   {
+
+   this(new Path(Preconditions.checkNotNull(path2, "The file path may not be null.")),mapper, context);
+
+
+   }
+
+   public CsvReader getEdgeReader()
+   {
+   

[jira] [Commented] (FLINK-1520) Read edges and vertices from CSV files

2015-06-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14590557#comment-14590557
 ] 

ASF GitHub Bot commented on FLINK-1520:
---

Github user vasia commented on a diff in the pull request:

https://github.com/apache/flink/pull/847#discussion_r32672827
  
--- Diff: 
flink-staging/flink-gelly/src/main/java/org/apache/flink/graph/Graph.java ---
@@ -282,6 +282,58 @@ public void flatMap(Edge<K, EV> edge, Collector<Tuple1<K>> out) {
}
 
/**
+   * Creates a graph from CSV files.
+   *
+   * Vertices with value are created from a CSV file with 2 fields
+   * Edges with value are created from a CSV file with 3 fields
+   * from Tuple3.
+   *
+   * @param path1 path to a CSV file with the Vertices data.
+   * @param path2 path to a CSV file with the Edges data
+   * @param context the flink execution environment.
+   * @return An instance of {@link org.apache.flink.graph.GraphCsvReader} 
, which on calling types() method to specify types of the
+   *Vertex ID, Vertex Value and Edge value returns a Graph
+   */
+
+   public static  GraphCsvReader fromCsvReader(String path1, String path2, 
ExecutionEnvironment context){
+   return (new GraphCsvReader(path1,path2,context));
+   }
+   /** Creates a graph from a CSV file for Edges., Vertices are
+   * induced from the edges.
+   *
+   * Edges with value are created from a CSV file with 3 fields. Vertices 
are created
+   * automatically and their values are set to NullValue.
+   *
+   * @param path a path to a CSV file with the Edges data
+   * @param context the flink execution environment.
+   * @return An instance of {@link org.apache.flink.graph.GraphCsvReader} 
, which on calling types() method to specify types of the
+   * Vertex ID, Vertex Value and Edge value returns a Graph
+   */
+
--- End diff --

new line
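
The second Javadoc quoted above says that, when only an edge file is given, vertices are induced from the edges and their values are set to `NullValue`. A rough, Flink-free sketch of that induction step follows. It is an assumption about the idea, not the code in the pull request: `InducedVertexSketch` and `induceVertexIds` are hypothetical names, and the absent vertex value is modelled simply by returning IDs only.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration of inducing vertices from an edge list; not Gelly code.
final class InducedVertexSketch {

    /**
     * Collects every distinct source and target ID from three-field edge rows.
     * Insertion order is kept so the result is deterministic.
     */
    static Set<String> induceVertexIds(List<String> edgeLines) {
        Set<String> ids = new LinkedHashSet<>();
        for (String line : edgeLines) {
            String[] f = line.split(",");
            if (f.length != 3) {
                throw new IllegalArgumentException("Expected 3 fields: " + line);
            }
            ids.add(f[0].trim()); // source ID
            ids.add(f[1].trim()); // target ID
        }
        return ids;
    }

    public static void main(String[] args) {
        Set<String> ids = induceVertexIds(Arrays.asList("a,b,1.0", "b,c,2.0"));
        System.out.println(ids);
    }
}
```

Using a set means a vertex that appears in many edges is created once, which is the behaviour the phrase "created automatically" suggests.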




[jira] [Commented] (FLINK-1520) Read edges and vertices from CSV files

2015-06-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14590554#comment-14590554
 ] 

ASF GitHub Bot commented on FLINK-1520:
---

Github user vasia commented on a diff in the pull request:

https://github.com/apache/flink/pull/847#discussion_r32672778
  
--- Diff: 
flink-staging/flink-gelly/src/main/java/org/apache/flink/graph/Graph.java ---
@@ -282,6 +282,58 @@ public void flatMap(Edge<K, EV> edge, Collector<Tuple1<K>> out) {
}
 
/**
+   * Creates a graph from CSV files.
+   *
+   * Vertices with value are created from a CSV file with 2 fields
+   * Edges with value are created from a CSV file with 3 fields
+   * from Tuple3.
+   *
+   * @param path1 path to a CSV file with the Vertices data.
+   * @param path2 path to a CSV file with the Edges data
+   * @param context the flink execution environment.
+   * @return An instance of {@link org.apache.flink.graph.GraphCsvReader} 
, which on calling types() method to specify types of the
+   *Vertex ID, Vertex Value and Edge value returns a Graph
+   */
+
--- End diff --

remove new line




[jira] [Commented] (FLINK-1520) Read edges and vertices from CSV files

2015-06-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14590563#comment-14590563
 ] 

ASF GitHub Bot commented on FLINK-1520:
---

Github user vasia commented on a diff in the pull request:

https://github.com/apache/flink/pull/847#discussion_r32673105
  
--- Diff: 
flink-staging/flink-gelly/src/main/java/org/apache/flink/graph/GraphCsvReader.java
 ---
@@ -0,0 +1,388 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.graph;
+import com.google.common.base.Preconditions;
+import org.apache.flink.api.common.functions.MapFunction;
+import org.apache.flink.api.java.DataSet;
+import org.apache.flink.api.java.io.CsvReader;
+import org.apache.flink.api.java.tuple.Tuple2;
+import org.apache.flink.api.java.tuple.Tuple3;
+import org.apache.flink.core.fs.Path;
+import org.apache.flink.api.java.ExecutionEnvironment;
+import org.apache.flink.graph.Graph;
+import org.apache.flink.types.NullValue;
+import org.apache.flink.core.fs.Path;
+
+
+/**
+ * A class to build a Graph using path(s) provided to CSV file(s) with 
edge (vertices) data
+ * The class also configures the CSV readers used to read edges(vertices) 
data such as the field types,
+ * the delimiters (row and field),  the fields that should be included or 
skipped, and other flags
+ * such as whether to skip the initial line as the header.
+ * The configuration is done using the functions provided in The {@link 
org.apache.flink.api.java.io.CsvReader} class.
+ */
+
+
--- End diff --

remove new lines




[jira] [Commented] (FLINK-1520) Read edges and vertices from CSV files

2015-06-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14590609#comment-14590609
 ] 

ASF GitHub Bot commented on FLINK-1520:
---

Github user shghatge commented on a diff in the pull request:

https://github.com/apache/flink/pull/847#discussion_r32675117
  
--- Diff: 
flink-staging/flink-gelly/src/main/java/org/apache/flink/graph/GraphCsvReader.java
 ---
@@ -0,0 +1,388 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.graph;
+import com.google.common.base.Preconditions;
+import org.apache.flink.api.common.functions.MapFunction;
+import org.apache.flink.api.java.DataSet;
+import org.apache.flink.api.java.io.CsvReader;
+import org.apache.flink.api.java.tuple.Tuple2;
+import org.apache.flink.api.java.tuple.Tuple3;
+import org.apache.flink.core.fs.Path;
+import org.apache.flink.api.java.ExecutionEnvironment;
+import org.apache.flink.graph.Graph;
+import org.apache.flink.types.NullValue;
+import org.apache.flink.core.fs.Path;
+
+
+/**
+ * A class to build a Graph using path(s) provided to CSV file(s) with edge (vertices) data
+ * The class also configures the CSV readers used to read edges(vertices) data such as the field types,
+ * the delimiters (row and field),  the fields that should be included or skipped, and other flags
+ * such as whether to skip the initial line as the header.
+ * The configuration is done using the functions provided in The {@link org.apache.flink.api.java.io.CsvReader} class.
+ */
+
+
+
+public class GraphCsvReader{
+
+   private final Path path1,path2;
+   private final ExecutionEnvironment executionContext;
+
+   private Path edgePath;
+   private Path vertexPath;
+   protected CsvReader EdgeReader;
+   protected CsvReader VertexReader;
+   protected MapFunction mapper;
+

+//
+
+   public GraphCsvReader(Path path1,Path path2, ExecutionEnvironment context)
+   {
+   this.path1 = path1;
+   this.path2 = path2;
+   this.VertexReader = new CsvReader(path1,context);
+   this.EdgeReader = new CsvReader(path2,context);
+   this.mapper=null;
+   this.executionContext=context;
+   }
+
+   public GraphCsvReader(Path path2, ExecutionEnvironment context)
+   {
+   this.path1=null;
+   this.path2 = path2;
+   this.EdgeReader = new CsvReader(path2,context);
+   this.VertexReader = null;
+   this.mapper = null;
+   this.executionContext=context;
+   }
+
+   public GraphCsvReader(Path path2,final MapFunction mapper, ExecutionEnvironment context)
+   {
+   this.path1=null;
+   this.path2 = path2;
+   this.EdgeReader = new CsvReader(path2,context);
+   this.VertexReader = null;
+   this.mapper = mapper;
+   this.executionContext=context;
+   }
+
+   public GraphCsvReader (String path2,ExecutionEnvironment context)
+   {
+   this(new Path(Preconditions.checkNotNull(path2, "The file path may not be null.")), context);
+
+   }
+
+   public GraphCsvReader(String path1, String path2, ExecutionEnvironment context)
+   {
+   this(new Path(Preconditions.checkNotNull(path1, "The file path may not be null.")),new Path(Preconditions.checkNotNull(path2, "The file path may not be null.")), context);
+   }
+
+
+   public GraphCsvReader (String path2, final MapFunction mapper, ExecutionEnvironment context)
+   {
+
+   this(new Path(Preconditions.checkNotNull(path2, "The file path may not be null.")),mapper, context);
+
+
+   }
+
+   public CsvReader getEdgeReader()
+   {
+

[jira] [Commented] (FLINK-1520) Read edges and vertices from CSV files

2015-06-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14590623#comment-14590623
 ] 

ASF GitHub Bot commented on FLINK-1520:
---

Github user vasia commented on a diff in the pull request:

https://github.com/apache/flink/pull/847#discussion_r32676277
  

[jira] [Commented] (FLINK-1520) Read edges and vertices from CSV files

2015-06-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14590627#comment-14590627
 ] 

ASF GitHub Bot commented on FLINK-1520:
---

Github user shghatge commented on a diff in the pull request:

https://github.com/apache/flink/pull/847#discussion_r32676617
  
--- Diff: 
flink-staging/flink-gelly/src/main/java/org/apache/flink/graph/GraphCsvReader.java
 ---
@@ -0,0 +1,388 @@
+public class GraphCsvReader{
+
+   private final Path path1,path2;
+   private final ExecutionEnvironment executionContext;
+
+   private Path edgePath;
+   private Path vertexPath;
+   protected CsvReader EdgeReader;
+   protected CsvReader VertexReader;
+   protected MapFunction mapper;
--- End diff --

I tried adding the function with types, but it gives an error. The 
methods for the mapper are tested and working, though.


> Read edges and vertices from CSV files
> --
>
> Key: FLINK-1520
> URL: https://issues.apache.org/jira/browse/FLINK-1520
> Project: Flink
>  Issue Type: New Feature
>  Components: Gelly
>Reporter: Vasia Kalavri
>Assignee: Shivani Ghatge
>Priority: Minor
>  Labels: easyfix, newbie
>
> Add methods to create Vertex and Edge Datasets directly from CSV file inputs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-1520) Read edges and vertices from CSV files

2015-06-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14590633#comment-14590633
 ] 

ASF GitHub Bot commented on FLINK-1520:
---

Github user vasia commented on the pull request:

https://github.com/apache/flink/pull/847#issuecomment-112954221
  
Hey @shghatge,

this is a great first try: you got the logic right and I really like the 
detailed javadocs ^^
I left a few inline comments, which should be easy to fix.

Let me also elaborate a bit on some general guidelines:
- Code formatting: we don't really have a strict Java code style, but there 
are a few things you can improve. For your code to be readable, it is nice to 
leave a space after the commas separating arguments, for example 
`myMethod(arg1, arg2, arg3)` instead of `myMethod(arg1,arg2,arg3)`.
We usually separate the closing parenthesis and the opening curly bracket 
with a space, i.e. `myMethod() { ... }` instead of `myMethod(){ ... }`.
Also, try to avoid adding new lines if they are not needed.
Regarding the missing types: this does not create an error, but it gives a 
warning. You can turn on warning notifications in your IDE to catch this.

- I like that you added separate `includeFields` methods for vertices and 
edges. It would probably make sense to do the same for the rest of the 
methods. For example, you might want to skip the first line in the edges 
file, but not in the vertices file. Right now, you are forced to do it for 
both files or for neither. Alternatively, we could add parameters to the 
existing methods to define the behavior for the edges and vertices files 
separately, for example 
`public GraphCsvReader lineDelimiter(String vertexDelimiter, String edgeDelimiter)`. 
What do you think?

- Finally, in order to catch issues like the one with the null 
`VertexReader`, you should always try to test as much of the functionality 
you have added as possible. In this case, it would be a good idea to add a 
test that reads from edges only, plus some tests for the different methods 
you have added.

Let me know if you have questions!
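
To make the two options above concrete, here is a minimal sketch in plain Java. It is only an illustration under stated assumptions: `CsvSourceConfig`, `GraphCsvConfigSketch` and every method on them are hypothetical stand-ins, not the PR's `GraphCsvReader` and not part of the Flink or Gelly API. It shows one method per file, one method taking a value per file, and a guard for the edges-only case where no vertex reader exists.

```java
// Hypothetical stand-ins for illustration; none of these names exist in Flink or Gelly.
final class CsvSourceConfig {
    String lineDelimiter = "\n";
    boolean ignoreFirstLine = false;
}

final class GraphCsvConfigSketch {
    // Stays null when the graph is built from an edges file only.
    private final CsvSourceConfig vertexConfig;
    private final CsvSourceConfig edgeConfig = new CsvSourceConfig();

    GraphCsvConfigSketch(boolean hasVertexFile) {
        this.vertexConfig = hasVertexFile ? new CsvSourceConfig() : null;
    }

    // Option 1: a separate method per file.
    GraphCsvConfigSketch ignoreFirstLineEdges() {
        edgeConfig.ignoreFirstLine = true;
        return this;
    }

    GraphCsvConfigSketch ignoreFirstLineVertices() {
        // Guard the edges-only case instead of dereferencing a null config.
        if (vertexConfig != null) {
            vertexConfig.ignoreFirstLine = true;
        }
        return this;
    }

    // Option 2: a single method that takes one value per file.
    GraphCsvConfigSketch lineDelimiter(String vertexDelimiter, String edgeDelimiter) {
        if (vertexConfig != null) {
            vertexConfig.lineDelimiter = vertexDelimiter;
        }
        edgeConfig.lineDelimiter = edgeDelimiter;
        return this;
    }

    boolean edgesIgnoreFirstLine() {
        return edgeConfig.ignoreFirstLine;
    }

    boolean verticesIgnoreFirstLine() {
        return vertexConfig != null && vertexConfig.ignoreFirstLine;
    }

    String edgeLineDelimiter() {
        return edgeConfig.lineDelimiter;
    }

    // Returns null when there is no vertices file.
    String vertexLineDelimiter() {
        return vertexConfig == null ? null : vertexConfig.lineDelimiter;
    }
}
```

With separate methods, `new GraphCsvConfigSketch(true).ignoreFirstLineEdges()` skips the header of the edges file only. The edges-only construction, `new GraphCsvConfigSketch(false)`, is the case an "edges only" test would exercise.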





[jira] [Commented] (FLINK-1520) Read edges and vertices from CSV files

2015-06-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14590561#comment-14590561
 ] 

ASF GitHub Bot commented on FLINK-1520:
---

Github user vasia commented on a diff in the pull request:

https://github.com/apache/flink/pull/847#discussion_r32673063
  
--- Diff: 
flink-staging/flink-gelly/src/main/java/org/apache/flink/graph/Graph.java ---
@@ -282,6 +282,58 @@ public void flatMap(Edge<K, EV> edge, Collector<Tuple1<K>> out) {
}
 
/**
+   * Creates a graph from CSV files.
+   *
+   * Vertices with value are created from a CSV file with 2 fields
+   * Edges with value are created from a CSV file with 3 fields
+   * from Tuple3.
+   *
+   * @param path1 path to a CSV file with the Vertices data.
+   * @param path2 path to a CSV file with the Edges data
+   * @param context the flink execution environment.
+   * @return An instance of {@link org.apache.flink.graph.GraphCsvReader} , which on calling types() method to specify types of the
+   *Vertex ID, Vertex Value and Edge value returns a Graph
+   */
+
+   public static  GraphCsvReader fromCsvReader(String path1, String path2, ExecutionEnvironment context){
--- End diff --

I would rename `path1` and `path2` to something like `verticesPath` and 
`edgesPath`
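
As a rough sketch of what the rename buys, assuming a hypothetical holder class `GraphCsvPaths` (not the PR's `GraphCsvReader`; the null-check messages are also made up for illustration): with descriptive parameter names, a call site that swaps the two files is easy to spot, and each null check can name the file it refers to.

```java
// Hypothetical holder class for illustration only; not part of Flink or Gelly.
final class GraphCsvPaths {
    final String verticesPath; // null when only an edges file is given
    final String edgesPath;

    private GraphCsvPaths(String verticesPath, String edgesPath) {
        this.verticesPath = verticesPath;
        this.edgesPath = java.util.Objects.requireNonNull(
                edgesPath, "The edges file path may not be null.");
    }

    // Vertices and edges are both read from files.
    static GraphCsvPaths fromCsvReader(String verticesPath, String edgesPath) {
        java.util.Objects.requireNonNull(
                verticesPath, "The vertices file path may not be null.");
        return new GraphCsvPaths(verticesPath, edgesPath);
    }

    // Edges only; vertices would be induced from the edges.
    static GraphCsvPaths fromCsvReader(String edgesPath) {
        return new GraphCsvPaths(null, edgesPath);
    }
}
```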




[jira] [Commented] (FLINK-1520) Read edges and vertices from CSV files

2015-06-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14590603#comment-14590603
 ] 

ASF GitHub Bot commented on FLINK-1520:
---

Github user vasia commented on a diff in the pull request:

https://github.com/apache/flink/pull/847#discussion_r32674930
  

[jira] [Commented] (FLINK-1520) Read edges and vertices from CSV files

2015-06-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14590604#comment-14590604
 ] 

ASF GitHub Bot commented on FLINK-1520:
---

Github user vasia commented on a diff in the pull request:

https://github.com/apache/flink/pull/847#discussion_r32674944
  

[jira] [Commented] (FLINK-1520) Read edges and vertices from CSV files

2015-06-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14590602#comment-14590602
 ] 

ASF GitHub Bot commented on FLINK-1520:
---

Github user vasia commented on a diff in the pull request:

https://github.com/apache/flink/pull/847#discussion_r32674923
  

[jira] [Commented] (FLINK-1520) Read edges and vertices from CSV files

2015-06-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14590575#comment-14590575
 ] 

ASF GitHub Bot commented on FLINK-1520:
---

Github user vasia commented on a diff in the pull request:

https://github.com/apache/flink/pull/847#discussion_r32673330
  
--- Diff: 
flink-staging/flink-gelly/src/main/java/org/apache/flink/graph/Graph.java ---
@@ -282,6 +282,58 @@ public void flatMap(Edge<K, EV> edge, Collector<Tuple1<K>> out) {
}
 
/**
+   * Creates a graph from CSV files.
+   *
+   * Vertices with value are created from a CSV file with 2 fields
+   * Edges with value are created from a CSV file with 3 fields
+   * from Tuple3.
+   *
+   * @param path1 path to a CSV file with the Vertices data.
+   * @param path2 path to a CSV file with the Edges data
+   * @param context the flink execution environment.
+   * @return An instance of {@link org.apache.flink.graph.GraphCsvReader} , which on calling types() method to specify types of the
+   *Vertex ID, Vertex Value and Edge value returns a Graph
+   */
+
+   public static  GraphCsvReader fromCsvReader(String path1, String path2, ExecutionEnvironment context){
+   return (new GraphCsvReader(path1,path2,context));
+   }
+   /** Creates a graph from a CSV file for Edges., Vertices are
+   * induced from the edges.
+   *
+   * Edges with value are created from a CSV file with 3 fields. Vertices are created
+   * automatically and their values are set to NullValue.
+   *
+   * @param path a path to a CSV file with the Edges data
+   * @param context the flink execution environment.
+   * @return An instance of {@link org.apache.flink.graph.GraphCsvReader} , which on calling types() method to specify types of the
+   * Vertex ID, Vertex Value and Edge value returns a Graph
+   */
+
+   public static GraphCsvReader fromCsvReader(String path, ExecutionEnvironment context){
+   return (new GraphCsvReader(path,context));
+   }
+
+   /**
+*Creates a graph from a CSV file for Edges., Vertices are
+* induced from the edges and vertex values are calculated by a mapper
+* function.  Edges with value are created from a CSV file with 3 fields.
+* Vertices are created automatically and their values are set by applying the provided map
+* function to the vertex ids.
+*
+* @param path a path to a CSV file with the Edges data
+* @param mapper the mapper function.
+* @param context the flink execution environment.
+* @return An instance of {@link org.apache.flink.graph.GraphCsvReader} , which on calling types() method to specify types of the
+* Vertex ID, Vertex Value and Edge value returns a Graph
+*/
+
+   public static GraphCsvReader fromCsvReader(String path, final MapFunction mapper,ExecutionEnvironment context)
--- End diff --

add type arguments to MapFunction
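
For illustration, a small sketch of the difference between a raw and a parameterized mapper. It assumes a stand-in interface `Mapper<T, O>` so that it compiles without Flink on the classpath; `Mapper`, `TypedMapperSketch`, `applyRaw` and `applyTyped` are all hypothetical names, and the type parameters `K` (vertex ID) and `VV` (vertex value) are chosen here only to mirror the Gelly naming.

```java
// Stand-in for a single-method mapper interface in the style of Flink's
// MapFunction<T, O>; the real interface additionally declares "throws Exception".
interface Mapper<T, O> {
    O map(T value);
}

final class TypedMapperSketch {
    // Raw type: compiles, but only with rawtypes/unchecked warnings, and the
    // compiler cannot check that the mapper fits the vertex ID type.
    @SuppressWarnings({"rawtypes", "unchecked"})
    static Object applyRaw(Mapper mapper, Object vertexId) {
        return mapper.map(vertexId);
    }

    // Parameterized: K is the vertex ID type, VV the vertex value type, so a
    // mismatched mapper is rejected at compile time.
    static <K, VV> VV applyTyped(Mapper<K, VV> mapper, K vertexId) {
        return mapper.map(vertexId);
    }
}
```

With `Mapper<Long, String> label = id -> "v" + id;`, the call `TypedMapperSketch.applyTyped(label, 7L)` is checked by the compiler and yields a `String`, while `applyRaw` accepts any object and only fails at run time if the types do not match.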




[jira] [Commented] (FLINK-1520) Read edges and vertices from CSV files

2015-06-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14590614#comment-14590614
 ] 

ASF GitHub Bot commented on FLINK-1520:
---

Github user shghatge commented on a diff in the pull request:

https://github.com/apache/flink/pull/847#discussion_r32675480
  
--- Diff: 
flink-staging/flink-gelly/src/main/java/org/apache/flink/graph/GraphCsvReader.java
 ---
@@ -0,0 +1,388 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.graph;
+import com.google.common.base.Preconditions;
+import org.apache.flink.api.common.functions.MapFunction;
+import org.apache.flink.api.java.DataSet;
+import org.apache.flink.api.java.io.CsvReader;
+import org.apache.flink.api.java.tuple.Tuple2;
+import org.apache.flink.api.java.tuple.Tuple3;
+import org.apache.flink.core.fs.Path;
+import org.apache.flink.api.java.ExecutionEnvironment;
+import org.apache.flink.graph.Graph;
+import org.apache.flink.types.NullValue;
+import org.apache.flink.core.fs.Path;
+
+
+/**
+ * A class to build a Graph using path(s) provided to CSV file(s) with edge (vertices) data
+ * The class also configures the CSV readers used to read edges(vertices) data such as the field types,
+ * the delimiters (row and field),  the fields that should be included or skipped, and other flags
+ * such as whether to skip the initial line as the header.
+ * The configuration is done using the functions provided in The {@link org.apache.flink.api.java.io.CsvReader} class.
+ */
+
+
+
+public class GraphCsvReader{
+
+   private final Path path1,path2;
+   private final ExecutionEnvironment executionContext;
+
+   private Path edgePath;
+   private Path vertexPath;
+   protected CsvReader EdgeReader;
+   protected CsvReader VertexReader;
+   protected MapFunction mapper;
+

+//
+
+   public GraphCsvReader(Path path1,Path path2, ExecutionEnvironment context)
+   {
+   this.path1 = path1;
+   this.path2 = path2;
+   this.VertexReader = new CsvReader(path1,context);
+   this.EdgeReader = new CsvReader(path2,context);
+   this.mapper=null;
+   this.executionContext=context;
+   }
+
+   public GraphCsvReader(Path path2, ExecutionEnvironment context)
+   {
+   this.path1=null;
+   this.path2 = path2;
+   this.EdgeReader = new CsvReader(path2,context);
+   this.VertexReader = null;
+   this.mapper = null;
+   this.executionContext=context;
+   }
+
+   public GraphCsvReader(Path path2,final MapFunction mapper, ExecutionEnvironment context)
+   {
+   this.path1=null;
+   this.path2 = path2;
+   this.EdgeReader = new CsvReader(path2,context);
+   this.VertexReader = null;
+   this.mapper = mapper;
+   this.executionContext=context;
+   }
+
+   public GraphCsvReader (String path2,ExecutionEnvironment context)
+   {
+   this(new Path(Preconditions.checkNotNull(path2, "The file path may not be null.")), context);
+
+   }
+
+   public GraphCsvReader(String path1, String path2, ExecutionEnvironment context)
+   {
+   this(new Path(Preconditions.checkNotNull(path1, "The file path may not be null.")),new Path(Preconditions.checkNotNull(path2, "The file path may not be null.")), context);
+   }
+
+
+   public GraphCsvReader (String path2, final MapFunction mapper, ExecutionEnvironment context)
+   {
+
+   this(new Path(Preconditions.checkNotNull(path2, "The file path may not be null.")),mapper, context);
+
+
+   }
+
+   public CsvReader getEdgeReader()
+   {
+

[jira] [Commented] (FLINK-1520) Read edges and vertices from CSV files

2015-06-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14590566#comment-14590566
 ] 

ASF GitHub Bot commented on FLINK-1520:
---

Github user vasia commented on a diff in the pull request:

https://github.com/apache/flink/pull/847#discussion_r32673158
  
--- Diff: 
flink-staging/flink-gelly/src/main/java/org/apache/flink/graph/GraphCsvReader.java
 ---
@@ -0,0 +1,388 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.graph;
+import com.google.common.base.Preconditions;
+import org.apache.flink.api.common.functions.MapFunction;
+import org.apache.flink.api.java.DataSet;
+import org.apache.flink.api.java.io.CsvReader;
+import org.apache.flink.api.java.tuple.Tuple2;
+import org.apache.flink.api.java.tuple.Tuple3;
+import org.apache.flink.core.fs.Path;
+import org.apache.flink.api.java.ExecutionEnvironment;
+import org.apache.flink.graph.Graph;
+import org.apache.flink.types.NullValue;
+import org.apache.flink.core.fs.Path;
--- End diff --

unused import


 Read edges and vertices from CSV files
 --

 Key: FLINK-1520
 URL: https://issues.apache.org/jira/browse/FLINK-1520
 Project: Flink
  Issue Type: New Feature
  Components: Gelly
Reporter: Vasia Kalavri
Assignee: Shivani Ghatge
Priority: Minor
  Labels: easyfix, newbie

 Add methods to create Vertex and Edge Datasets directly from CSV file inputs.





[jira] [Commented] (FLINK-1520) Read edges and vertices from CSV files

2015-06-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14590572#comment-14590572
 ] 

ASF GitHub Bot commented on FLINK-1520:
---

Github user vasia commented on a diff in the pull request:

https://github.com/apache/flink/pull/847#discussion_r32673274
  
--- Diff: 
flink-staging/flink-gelly/src/main/java/org/apache/flink/graph/GraphCsvReader.java
 ---
@@ -0,0 +1,388 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.graph;
+import com.google.common.base.Preconditions;
+import org.apache.flink.api.common.functions.MapFunction;
+import org.apache.flink.api.java.DataSet;
+import org.apache.flink.api.java.io.CsvReader;
+import org.apache.flink.api.java.tuple.Tuple2;
+import org.apache.flink.api.java.tuple.Tuple3;
+import org.apache.flink.core.fs.Path;
+import org.apache.flink.api.java.ExecutionEnvironment;
+import org.apache.flink.graph.Graph;
+import org.apache.flink.types.NullValue;
+import org.apache.flink.core.fs.Path;
+
+
+/**
+ * A class to build a Graph using path(s) provided to CSV file(s) with edge (vertices) data
+ * The class also configures the CSV readers used to read edges(vertices) data such as the field types,
+ * the delimiters (row and field),  the fields that should be included or skipped, and other flags
+ * such as whether to skip the initial line as the header.
+ * The configuration is done using the functions provided in The {@link org.apache.flink.api.java.io.CsvReader} class.
+ */
+
+
+
+public class GraphCsvReader{
+
+   private final Path path1,path2;
+   private final ExecutionEnvironment executionContext;
+
+   private Path edgePath;
+   private Path vertexPath;
+   protected CsvReader EdgeReader;
+   protected CsvReader VertexReader;
+   protected MapFunction mapper;
--- End diff --

add type arguments to MapFunction


 Read edges and vertices from CSV files
 --

 Key: FLINK-1520
 URL: https://issues.apache.org/jira/browse/FLINK-1520
 Project: Flink
  Issue Type: New Feature
  Components: Gelly
Reporter: Vasia Kalavri
Assignee: Shivani Ghatge
Priority: Minor
  Labels: easyfix, newbie

 Add methods to create Vertex and Edge Datasets directly from CSV file inputs.





[jira] [Commented] (FLINK-1520) Read edges and vertices from CSV files

2015-06-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14590568#comment-14590568
 ] 

ASF GitHub Bot commented on FLINK-1520:
---

Github user vasia commented on a diff in the pull request:

https://github.com/apache/flink/pull/847#discussion_r32673221
  
--- Diff: 
flink-staging/flink-gelly/src/main/java/org/apache/flink/graph/GraphCsvReader.java
 ---
@@ -0,0 +1,388 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.graph;
+import com.google.common.base.Preconditions;
+import org.apache.flink.api.common.functions.MapFunction;
+import org.apache.flink.api.java.DataSet;
+import org.apache.flink.api.java.io.CsvReader;
+import org.apache.flink.api.java.tuple.Tuple2;
+import org.apache.flink.api.java.tuple.Tuple3;
+import org.apache.flink.core.fs.Path;
+import org.apache.flink.api.java.ExecutionEnvironment;
+import org.apache.flink.graph.Graph;
+import org.apache.flink.types.NullValue;
+import org.apache.flink.core.fs.Path;
+
+
+/**
+ * A class to build a Graph using path(s) provided to CSV file(s) with edge (vertices) data
+ * The class also configures the CSV readers used to read edges(vertices) data such as the field types,
+ * the delimiters (row and field),  the fields that should be included or skipped, and other flags
+ * such as whether to skip the initial line as the header.
+ * The configuration is done using the functions provided in The {@link org.apache.flink.api.java.io.CsvReader} class.
+ */
+
+
+
+public class GraphCsvReader{
+
+   private final Path path1,path2;
+   private final ExecutionEnvironment executionContext;
+
+   private Path edgePath;
+   private Path vertexPath;
--- End diff --

`edgePath` and `vertexPath` also seem to be unused


 Read edges and vertices from CSV files
 --

 Key: FLINK-1520
 URL: https://issues.apache.org/jira/browse/FLINK-1520
 Project: Flink
  Issue Type: New Feature
  Components: Gelly
Reporter: Vasia Kalavri
Assignee: Shivani Ghatge
Priority: Minor
  Labels: easyfix, newbie

 Add methods to create Vertex and Edge Datasets directly from CSV file inputs.





[jira] [Commented] (FLINK-1520) Read edges and vertices from CSV files

2015-06-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14590584#comment-14590584
 ] 

ASF GitHub Bot commented on FLINK-1520:
---

Github user vasia commented on a diff in the pull request:

https://github.com/apache/flink/pull/847#discussion_r32673806
  
--- Diff: 
flink-staging/flink-gelly/src/main/java/org/apache/flink/graph/GraphCsvReader.java
 ---
@@ -0,0 +1,388 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.graph;
+import com.google.common.base.Preconditions;
+import org.apache.flink.api.common.functions.MapFunction;
+import org.apache.flink.api.java.DataSet;
+import org.apache.flink.api.java.io.CsvReader;
+import org.apache.flink.api.java.tuple.Tuple2;
+import org.apache.flink.api.java.tuple.Tuple3;
+import org.apache.flink.core.fs.Path;
+import org.apache.flink.api.java.ExecutionEnvironment;
+import org.apache.flink.graph.Graph;
+import org.apache.flink.types.NullValue;
+import org.apache.flink.core.fs.Path;
+
+
+/**
+ * A class to build a Graph using path(s) provided to CSV file(s) with edge (vertices) data
+ * The class also configures the CSV readers used to read edges(vertices) data such as the field types,
+ * the delimiters (row and field),  the fields that should be included or skipped, and other flags
+ * such as whether to skip the initial line as the header.
+ * The configuration is done using the functions provided in The {@link org.apache.flink.api.java.io.CsvReader} class.
+ */
+
+
+
+public class GraphCsvReader{
+
+   private final Path path1,path2;
+   private final ExecutionEnvironment executionContext;
+
+   private Path edgePath;
+   private Path vertexPath;
+   protected CsvReader EdgeReader;
+   protected CsvReader VertexReader;
+   protected MapFunction mapper;
+

+//
+
+   public GraphCsvReader(Path path1,Path path2, ExecutionEnvironment context)
+   {
+   this.path1 = path1;
+   this.path2 = path2;
+   this.VertexReader = new CsvReader(path1,context);
+   this.EdgeReader = new CsvReader(path2,context);
+   this.mapper=null;
+   this.executionContext=context;
+   }
+
+   public GraphCsvReader(Path path2, ExecutionEnvironment context)
+   {
+   this.path1=null;
+   this.path2 = path2;
+   this.EdgeReader = new CsvReader(path2,context);
+   this.VertexReader = null;
+   this.mapper = null;
+   this.executionContext=context;
+   }
+
+   public GraphCsvReader(Path path2,final MapFunction mapper, ExecutionEnvironment context)
+   {
+   this.path1=null;
+   this.path2 = path2;
+   this.EdgeReader = new CsvReader(path2,context);
+   this.VertexReader = null;
+   this.mapper = mapper;
+   this.executionContext=context;
+   }
+
+   public GraphCsvReader (String path2,ExecutionEnvironment context)
+   {
+   this(new Path(Preconditions.checkNotNull(path2, "The file path may not be null.")), context);
+
+   }
+
+   public GraphCsvReader(String path1, String path2, ExecutionEnvironment context)
+   {
+   this(new Path(Preconditions.checkNotNull(path1, "The file path may not be null.")),new Path(Preconditions.checkNotNull(path2, "The file path may not be null.")), context);
+   }
+
+
+   public GraphCsvReader (String path2, final MapFunction mapper, ExecutionEnvironment context)
+   {
+
+   this(new Path(Preconditions.checkNotNull(path2, "The file path may not be null.")),mapper, context);
+
+
+   }
+
+   public CsvReader getEdgeReader()
+   {
+   

[jira] [Commented] (FLINK-1520) Read edges and vertices from CSV files

2015-06-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14589674#comment-14589674
 ] 

ASF GitHub Bot commented on FLINK-1520:
---

GitHub user shghatge opened a pull request:

https://github.com/apache/flink/pull/846

Csv read graph


[FLINK-1520]Read edges and vertices from CSV files

Changes made:
1) Added a GraphCsvReader class which has two CsvReaders as members: 
EdgeReader and VertexReader.

To allow smooth method chaining when configuring the member CsvReaders, 
GraphCsvReader implements the configuration methods of CsvReader, so that 
each setting is applied to both CsvReaders with a single call and each 
method returns the GraphCsvReader again.
Only the methods that specify which fields to select from the individual 
files are separate for the Edge and Vertex readers.

Since the types must be specified because of type erasure, GraphCsvReader 
has a types method that returns a Graph with the specified types for the 
Vertex ID, Vertex Value and Edge Value. The alternative was to pass the 
types to the method that constructs the graph itself, but this approach was 
taken to stay as close to CsvReader as possible.

2) Added 3 methods to Graph.java, similar to the other Graph creation 
methods. They take one mandatory path, one optional path and an optional 
mapper. The only difference is that they return an instance of 
GraphCsvReader instead of a Graph.

3) Added appropriate test methods to GraphCreationITCase and 
GraphCreationWithMapperITCase.java.
Also added a createTempFile() method to both to help with the tests.

4) Added documentation for the new functionality to gelly_guide.md.
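
The chaining design described in this pull request can be sketched roughly as follows. This is an illustration in plain Java under stated assumptions, not the code from the pull request: ReaderSettings stands in for a CsvReader, and GraphReaderSketch with its method names (fieldDelimiter, ignoreFirstLine, includeFieldsEdges, includeFieldsVertices) is hypothetical.

```java
// Hypothetical settings holder standing in for one CSV reader; not Flink's CsvReader.
class ReaderSettings {
    String fieldDelimiter = ",";
    boolean skipFirstLine = false;
    String includedFields = null; // field mask; its format is an assumption of this sketch
}

// Hypothetical wrapper that owns one settings object per input file.
class GraphReaderSketch {
    final ReaderSettings edges = new ReaderSettings();
    final ReaderSettings vertices; // null when only an edge file is given

    GraphReaderSketch(boolean hasVertexFile) {
        this.vertices = hasVertexFile ? new ReaderSettings() : null;
    }

    // Shared setting: applied to both readers in one call.
    GraphReaderSketch fieldDelimiter(String delimiter) {
        edges.fieldDelimiter = delimiter;
        if (vertices != null) {
            vertices.fieldDelimiter = delimiter;
        }
        return this; // returning the wrapper keeps the chain going
    }

    // Shared setting: skip the header line in both files.
    GraphReaderSketch ignoreFirstLine() {
        edges.skipFirstLine = true;
        if (vertices != null) {
            vertices.skipFirstLine = true;
        }
        return this;
    }

    // Field selection stays separate per file, since the two files have different columns.
    GraphReaderSketch includeFieldsEdges(String mask) {
        edges.includedFields = mask;
        return this;
    }

    GraphReaderSketch includeFieldsVertices(String mask) {
        if (vertices == null) {
            throw new IllegalStateException("No vertex file was given.");
        }
        vertices.includedFields = mask;
        return this;
    }
}
```

A call chain would then read like new GraphReaderSketch(true).fieldDelimiter("\t").ignoreFirstLine().includeFieldsEdges("101"), with only the last call touching a single reader.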

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/shghatge/flink csv_readGraph

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/846.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #846


commit e8a5250b4588326b606b2f29d2f2c2f6e4554925
Author: Shivani shgha...@gmail.com
Date:   2015-06-10T11:22:37Z

[FLINK-2093][gelly] Added difference Method

commit b0a9228540fbafe819d1883e07087c4f59f4e4bb
Author: Shivani shgha...@gmail.com
Date:   2015-06-10T13:04:48Z

[FLINK-2093][gelly] Minor Changes in the Graph.java file

commit 24c7567b6fe533cd1a7cff1d9bb812ee5d8433fc
Author: Shivani shgha...@gmail.com
Date:   2015-06-15T10:37:00Z

Merge branch 'master' of https://github.com/apache/flink into difference_new

commit e726507d41a58abbc9a6abe2ead4e9e83b09
Author: Shivani shgha...@gmail.com
Date:   2015-06-15T12:13:58Z

[FLINK-2093][gelly]Added difference method

commit 760047dc78739b9eb750757aea442aa947c2fc34
Author: Shivani shgha...@gmail.com
Date:   2015-06-15T13:01:32Z

[FLINK-2093][gelly]Added difference method

commit 57f1b315f7fbb87c74085c9a68108fcd3ff58440
Author: Shivani shgha...@gmail.com
Date:   2015-06-15T14:09:29Z

[FLINK-2093][gelly]Added difference method

commit 9ca5d7485708c3dca7e41bf6d19b5bd9d492125f
Author: Shivani shgha...@gmail.com
Date:   2015-06-15T14:12:50Z

[FLINK-2093][gelly]Added difference method

commit eff14eff8c1ac1b88a76438257ad74c7a004bbd3
Author: Shivani shgha...@gmail.com
Date:   2015-06-16T16:37:36Z

[FLINK-1520]Read edges and vertices from CSV files

commit 1ae9eadc792500311c7fc24a8647364bc60902ec
Author: Shivani shgha...@gmail.com
Date:   2015-06-16T16:38:20Z

[FLINK-1520]Read edges and vertices from CSV files

commit c5f4410acf30d3d0bd12c4f215005aa517938dd2
Author: Shivani shgha...@gmail.com
Date:   2015-06-16T16:39:39Z

Merge branch 'master' of https://github.com/apache/flink into csv_readGraph

commit 342225af5f17fa7be906885ee3dcd1b2f6a6d176
Author: Shivani shgha...@gmail.com
Date:   2015-06-17T11:39:48Z

[FLINK-1520]Read edges and vertices from CSV files

commit 553e676003ee1419710146ddbe0a13e17fd3d237
Author: Shivani shgha...@gmail.com
Date:   2015-06-17T11:42:26Z

[FLINK-1520]Read edges and vertices from CSV files




 Read edges and vertices from CSV files
 --

 Key: FLINK-1520
 URL: https://issues.apache.org/jira/browse/FLINK-1520
 Project: Flink
  Issue Type: New Feature
  Components: Gelly
Reporter: Vasia Kalavri
Assignee: Shivani Ghatge
Priority: Minor
  Labels: easyfix, newbie

 Add methods to create Vertex and Edge Datasets directly from CSV file inputs.





[jira] [Commented] (FLINK-1520) Read edges and vertices from CSV files

2015-06-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14589770#comment-14589770
 ] 

ASF GitHub Bot commented on FLINK-1520:
---

Github user shghatge closed the pull request at:

https://github.com/apache/flink/pull/846


 Read edges and vertices from CSV files
 --

 Key: FLINK-1520
 URL: https://issues.apache.org/jira/browse/FLINK-1520
 Project: Flink
  Issue Type: New Feature
  Components: Gelly
Reporter: Vasia Kalavri
Assignee: Shivani Ghatge
Priority: Minor
  Labels: easyfix, newbie

 Add methods to create Vertex and Edge Datasets directly from CSV file inputs.





[jira] [Commented] (FLINK-1520) Read edges and vertices from CSV files

2015-06-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14589722#comment-14589722
 ] 

ASF GitHub Bot commented on FLINK-1520:
---

Github user andralungu commented on the pull request:

https://github.com/apache/flink/pull/846#issuecomment-112783592
  
Hi @shghatge ,

It seems we have a bit of a mess in this PR. Nothing that cannot be fixed. 
Let's take it offline.


 Read edges and vertices from CSV files
 --

 Key: FLINK-1520
 URL: https://issues.apache.org/jira/browse/FLINK-1520
 Project: Flink
  Issue Type: New Feature
  Components: Gelly
Reporter: Vasia Kalavri
Assignee: Shivani Ghatge
Priority: Minor
  Labels: easyfix, newbie

 Add methods to create Vertex and Edge Datasets directly from CSV file inputs.





[jira] [Commented] (FLINK-1520) Read edges and vertices from CSV files

2015-06-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14589774#comment-14589774
 ] 

ASF GitHub Bot commented on FLINK-1520:
---

GitHub user shghatge opened a pull request:

https://github.com/apache/flink/pull/847

[FLINK-1520]Read edges and vertices from CSV files



[FLINK-1520]Read edges and vertices from CSV files

Changes made:
1) Added a GraphCsvReader class which has two CsvReaders as members: 
EdgeReader and VertexReader.

To allow smooth method chaining when configuring the member CsvReaders, 
GraphCsvReader implements the configuration methods of CsvReader, so that 
each setting is applied to both CsvReaders with a single call and each 
method returns the GraphCsvReader again.
Only the methods that specify which fields to select from the individual 
files are separate for the Edge and Vertex readers.

Since the types must be specified because of type erasure, GraphCsvReader 
has a types method that returns a Graph with the specified types for the 
Vertex ID, Vertex Value and Edge Value. The alternative was to pass the 
types to the method that constructs the graph itself, but this approach was 
taken to stay as close to CsvReader as possible.

2) Added 3 methods to Graph.java, similar to the other Graph creation 
methods. They take one mandatory path, one optional path and an optional 
mapper. The only difference is that they return an instance of 
GraphCsvReader instead of a Graph.

3) Added appropriate test methods to GraphCreationITCase and 
GraphCreationWithMapperITCase.java.
Also added a createTempFile() method to both to help with the tests.

4) Added documentation for the new functionality to gelly_guide.md.
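
As an illustration of the types method mentioned in item 1, here is a minimal plain-Java sketch. It assumes an in-memory list of lines instead of a CSV file and supports only Long, Double and String fields. EdgeSketch and TypedReaderSketch are hypothetical names and do not reproduce the Gelly classes.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical typed edge; stands in for an edge with source, target and value.
class EdgeSketch<K, EV> {
    final K source;
    final K target;
    final EV value;

    EdgeSketch(K source, K target, EV value) {
        this.source = source;
        this.target = target;
        this.value = value;
    }
}

// Hypothetical reader: configuration first, types last, as in the description above.
class TypedReaderSketch {
    private final List<String> edgeLines; // in-memory stand-in for the edge CSV file

    TypedReaderSketch(List<String> edgeLines) {
        this.edgeLines = edgeLines;
    }

    // The class tokens carry the field types to runtime, which the generic
    // parameters alone cannot do after erasure.
    <K, EV> List<EdgeSketch<K, EV>> types(Class<K> idType, Class<EV> edgeValueType) {
        List<EdgeSketch<K, EV>> edges = new ArrayList<EdgeSketch<K, EV>>();
        for (String line : edgeLines) {
            String[] fields = line.split(",");
            edges.add(new EdgeSketch<K, EV>(
                    parse(fields[0], idType),
                    parse(fields[1], idType),
                    parse(fields[2], edgeValueType)));
        }
        return edges;
    }

    // Converts one raw field according to the requested class token.
    private static <T> T parse(String raw, Class<T> type) {
        String text = raw.trim();
        if (Long.class.equals(type)) {
            return type.cast(Long.valueOf(text));
        }
        if (Double.class.equals(type)) {
            return type.cast(Double.valueOf(text));
        }
        if (String.class.equals(type)) {
            return type.cast(text);
        }
        throw new IllegalArgumentException("Unsupported field type: " + type.getName());
    }
}
```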


Closed the previous pull request and made a new one with a fresh branch 
because the previous changes are not merged yet.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/shghatge/flink csv_clear_pull

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/847.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #847


commit b7c1079f9fe56a2586f36f8b5eca5208b33e9cf8
Author: Shivani shgha...@gmail.com
Date:   2015-06-17T13:37:36Z

[FLINK-1520]Read edges and vertices from CSV files




 Read edges and vertices from CSV files
 --

 Key: FLINK-1520
 URL: https://issues.apache.org/jira/browse/FLINK-1520
 Project: Flink
  Issue Type: New Feature
  Components: Gelly
Reporter: Vasia Kalavri
Assignee: Shivani Ghatge
Priority: Minor
  Labels: easyfix, newbie

 Add methods to create Vertex and Edge Datasets directly from CSV file inputs.





[jira] [Commented] (FLINK-1520) Read edges and vertices from CSV files

2015-06-13 Thread Andra Lungu (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14584874#comment-14584874
 ] 

Andra Lungu commented on FLINK-1520:


Yup, I am not happy with the argument passing, as it may be cumbersome for the 
user to work out what each argument means.
I thought about this approach; my only concern is that it will introduce a ton 
of duplicate code. In the end, you write (more or less) the same commands, 
except that instead of getting a DataSet, which you then turn into a graph with 
fromDataSet, you get a graph directly...

If we are okay with code duplication then I would +1 Vasia's solution.

 Read edges and vertices from CSV files
 --

 Key: FLINK-1520
 URL: https://issues.apache.org/jira/browse/FLINK-1520
 Project: Flink
  Issue Type: New Feature
  Components: Gelly
Reporter: Vasia Kalavri
Assignee: Shivani Ghatge
Priority: Minor
  Labels: easyfix, newbie

 Add methods to create Vertex and Edge Datasets directly from CSV file inputs.





[jira] [Commented] (FLINK-1520) Read edges and vertices from CSV files

2015-06-13 Thread Vasia Kalavri (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14584886#comment-14584886
 ] 

Vasia Kalavri commented on FLINK-1520:
--

I don't think it'll be a lot of duplicate code. You can have EdgeCsvReader wrap 
a CsvReader and just call its methods, no?
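
A rough sketch of the wrapping idea, in plain Java. CsvReaderSketch is a hypothetical stand-in for the wrapped reader (Flink's CsvReader is not used here), and EdgeCsvReaderSketch, fieldDelimiter, ignoreFirstLine and describe are illustrative names only.

```java
// Hypothetical stand-in for the wrapped reader.
class CsvReaderSketch {
    private String fieldDelimiter = ",";
    private boolean skipFirstLine = false;

    CsvReaderSketch fieldDelimiter(String delimiter) {
        this.fieldDelimiter = delimiter;
        return this;
    }

    CsvReaderSketch ignoreFirstLine() {
        this.skipFirstLine = true;
        return this;
    }

    String describe() {
        return "delimiter=" + fieldDelimiter + ", skipFirstLine=" + skipFirstLine;
    }
}

// Wrapper: each configuration method is a one-line forward to the wrapped
// reader, so the parsing logic itself is not duplicated.
class EdgeCsvReaderSketch {
    private final CsvReaderSketch reader = new CsvReaderSketch();

    EdgeCsvReaderSketch fieldDelimiter(String delimiter) {
        reader.fieldDelimiter(delimiter);
        return this; // return the wrapper so chained calls stay on the edge reader
    }

    EdgeCsvReaderSketch ignoreFirstLine() {
        reader.ignoreFirstLine();
        return this;
    }

    String describe() {
        return "edges[" + reader.describe() + "]";
    }
}
```

The duplication is limited to the forwarding method signatures, which is the trade-off under discussion.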

 Read edges and vertices from CSV files
 --

 Key: FLINK-1520
 URL: https://issues.apache.org/jira/browse/FLINK-1520
 Project: Flink
  Issue Type: New Feature
  Components: Gelly
Reporter: Vasia Kalavri
Assignee: Shivani Ghatge
Priority: Minor
  Labels: easyfix, newbie

 Add methods to create Vertex and Edge Datasets directly from CSV file inputs.





[jira] [Commented] (FLINK-1520) Read edges and vertices from CSV files

2015-06-13 Thread Andra Lungu (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14584586#comment-14584586
 ] 

Andra Lungu commented on FLINK-1520:


Hey [~vkalavri],

To my knowledge, you cannot deduce the key's or the value's class from the 
generic parameters K, VV, EV. The way I would implement fromCsv is by adding 
the classes as parameters, e.g. Graph.fromCsv(edgesPath, String.class, 
String.class, context). For NullValue, then, we would have a single class 
argument: Graph.fromCsv(edgesPath, String.class, context).
The user should know what kind of keys he/she has in there, so the extra 
parameters should not be that much of a burden.

Is this what you had in mind? For the time being, I cannot see a smarter way 
of doing it :) 

The examples should be updated accordingly since they now read the edge and 
vertex data sets from CSV and then use fromDataSet to produce the graph. 
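
The constraint behind this proposal can be sketched in plain Java. The example first shows that generic parameters are erased at runtime, then a hypothetical pair of fromCsv overloads that take class tokens. Assumptions: the two class arguments are read here as key type and edge value type, the execution context argument is left out, and the methods only return a description string rather than a graph.

```java
import java.util.ArrayList;
import java.util.List;

class ErasureSketch {
    // Generic parameters are erased: both lists share one runtime class,
    // so the element type cannot be recovered from the object itself.
    static boolean sameRuntimeClass() {
        List<String> names = new ArrayList<String>();
        List<Long> ids = new ArrayList<Long>();
        return names.getClass() == ids.getClass();
    }

    // Hypothetical factory: the class tokens make the types available at runtime.
    static <K, EV> String fromCsv(String edgesPath, Class<K> keyType, Class<EV> edgeValueType) {
        return edgesPath + ": keys=" + keyType.getSimpleName()
                + ", edge values=" + edgeValueType.getSimpleName();
    }

    // Single-class overload for edges without values (the NullValue case).
    static <K> String fromCsv(String edgesPath, Class<K> keyType) {
        return edgesPath + ": keys=" + keyType.getSimpleName() + ", edge values=none";
    }
}
```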

 Read edges and vertices from CSV files
 --

 Key: FLINK-1520
 URL: https://issues.apache.org/jira/browse/FLINK-1520
 Project: Flink
  Issue Type: New Feature
  Components: Gelly
Reporter: Vasia Kalavri
Assignee: Shivani Ghatge
Priority: Minor
  Labels: easyfix, newbie

 Add methods to create Vertex and Edge Datasets directly from CSV file inputs.





[jira] [Commented] (FLINK-1520) Read edges and vertices from CSV files

2015-06-03 Thread Vasia Kalavri (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14570586#comment-14570586
 ] 

Vasia Kalavri commented on FLINK-1520:
--

Hey [~cebe]! One more ping to you :)
If you're not working on this, can I release this issue? Thanks!

 Read edges and vertices from CSV files
 --

 Key: FLINK-1520
 URL: https://issues.apache.org/jira/browse/FLINK-1520
 Project: Flink
  Issue Type: New Feature
  Components: Gelly
Reporter: Vasia Kalavri
Assignee: Carsten Brandt
Priority: Minor
  Labels: easyfix, newbie

 Add methods to create Vertex and Edge Datasets directly from CSV file inputs.





[jira] [Commented] (FLINK-1520) Read edges and vertices from CSV files

2015-03-16 Thread Vasia Kalavri (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14363270#comment-14363270
 ] 

Vasia Kalavri commented on FLINK-1520:
--

Hi [~cebe]! Are you working on this? 
If you're stuck and need some help, let us know! The same goes if you're 
simply too busy and can't currently work on this :-) 
Thanks!

 Read edges and vertices from CSV files
 --

 Key: FLINK-1520
 URL: https://issues.apache.org/jira/browse/FLINK-1520
 Project: Flink
  Issue Type: New Feature
  Components: Gelly
Reporter: Vasia Kalavri
Assignee: Carsten Brandt
Priority: Minor
  Labels: easyfix, newbie

 Add methods to create Vertex and Edge Datasets directly from CSV file inputs.





[jira] [Commented] (FLINK-1520) Read edges and vertices from CSV files

2015-02-22 Thread Carsten Brandt (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14332277#comment-14332277
 ] 

Carsten Brandt commented on FLINK-1520:
---

[~rmetzger], thanks, it is working!

 Read edges and vertices from CSV files
 --

 Key: FLINK-1520
 URL: https://issues.apache.org/jira/browse/FLINK-1520
 Project: Flink
  Issue Type: New Feature
  Components: Gelly
Reporter: Vasia Kalavri
Assignee: Carsten Brandt
Priority: Minor
  Labels: easyfix, newbie

 Add methods to create Vertex and Edge Datasets directly from CSV file inputs.





[jira] [Commented] (FLINK-1520) Read edges and vertices from CSV files

2015-02-13 Thread Carsten Brandt (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14319956#comment-14319956
 ] 

Carsten Brandt commented on FLINK-1520:
---

[~vkalavri] you can assign me to this, will try to work on this next week.

https://github.com/project-flink/flink-graph/pull/64#issuecomment-73885671

 Read edges and vertices from CSV files
 --

 Key: FLINK-1520
 URL: https://issues.apache.org/jira/browse/FLINK-1520
 Project: Flink
  Issue Type: New Feature
  Components: Gelly
Reporter: Vasia Kalavri
Priority: Minor
  Labels: easyfix, newbie

 Add methods to create Vertex and Edge Datasets directly from CSV file inputs.





[jira] [Commented] (FLINK-1520) Read edges and vertices from CSV files

2015-02-13 Thread Vasia Kalavri (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14319970#comment-14319970
 ] 

Vasia Kalavri commented on FLINK-1520:
--

Perfect! It's all yours :)

 Read edges and vertices from CSV files
 --

 Key: FLINK-1520
 URL: https://issues.apache.org/jira/browse/FLINK-1520
 Project: Flink
  Issue Type: New Feature
  Components: Gelly
Reporter: Vasia Kalavri
Assignee: Carsten Brandt
Priority: Minor
  Labels: easyfix, newbie

 Add methods to create Vertex and Edge Datasets directly from CSV file inputs.


