Hi Guys,

Any help on this will be appreciated. I am repeating my question and my code below:

I am implementing an algorithm in Giraph that reads the vertex values from two input files, each with its own format. I am not using any EdgeInputFormatClass. I am now using VertexInputFormatDescription along with MultiVertexInputFormat, but I still could not figure out how to set the vertex input path for each input format class. Can you please take a look at my code below and show me how to set the vertex input path? I have taken a look at HiveGiraphRunner but still no luck. Thanks.

    if (null == getConf()) {
        conf = new Configuration();
    }
    GiraphConfiguration gconf = new GiraphConfiguration(getConf());
    int workers = Integer.parseInt(arg0[2]);
    gconf.setWorkerConfiguration(workers, workers, 100.0f);

    List<VertexInputFormatDescription> vertexInputDescriptions = Lists.newArrayList();

    // Input one
    VertexInputFormatDescription description1 =
        new VertexInputFormatDescription(UseCase1FirstVertexInputFormat.class);
    // how to set the vertex input path? i.e. how to say that I want to read
    // file1.txt using this input format class
    vertexInputDescriptions.add(description1);

    // Input two
    VertexInputFormatDescription description2 =
        new VertexInputFormatDescription(UseCase1SecondVertexInputFormat.class);
    // how to set the vertex input path?
    vertexInputDescriptions.add(description2);

    GiraphConstants.VERTEX_INPUT_FORMAT_CLASS.set(gconf, MultiVertexInputFormat.class);
    VertexInputFormatDescription.VERTEX_INPUT_FORMAT_DESCRIPTIONS.set(gconf,
        InputFormatDescription.toJsonString(vertexInputDescriptions));

    gconf.setVertexOutputFormatClass(UseCase1OutputFormat.class);
    gconf.setComputationClass(UseCase1Vertex.class);
    GiraphJob job = new GiraphJob(gconf, "Use Case 1");
    FileOutputFormat.setOutputPath(job.getInternalJob(), new Path(arg0[1]));
    return job.run(true) ? 0 : -1;

Thanks in advance.

Best,
Yasser

From: Yasser Altowim [mailto:yasser.alto...@ericsson.com]
Sent: Friday, August 16, 2013 11:36 AM
To: user@giraph.apache.org
Subject: RE: MultiVertexInputFormat

Thanks a lot Avery for your response.

I am now using VertexInputFormatDescription, but I still could not figure out how to set the vertex input path. I just need to read the vertex values from two different files, each with its own format. I am not using any EdgeInputFormatClass. Can you please take a look at my code below and show me how to set the vertex input path? Thanks.

    if (null == getConf()) {
        conf = new Configuration();
    }
    GiraphConfiguration gconf = new GiraphConfiguration(getConf());
    int workers = Integer.parseInt(arg0[2]);
    gconf.setWorkerConfiguration(workers, workers, 100.0f);

    List<VertexInputFormatDescription> vertexInputDescriptions = Lists.newArrayList();

    // Input one
    VertexInputFormatDescription description1 =
        new VertexInputFormatDescription(UseCase1FirstVertexInputFormat.class);
    // how to set the vertex input path?
    vertexInputDescriptions.add(description1);

    // Input two
    VertexInputFormatDescription description2 =
        new VertexInputFormatDescription(UseCase1SecondVertexInputFormat.class);
    // how to set the vertex input path?
    vertexInputDescriptions.add(description2);

    VertexInputFormatDescription.VERTEX_INPUT_FORMAT_DESCRIPTIONS.set(gconf,
        InputFormatDescription.toJsonString(vertexInputDescriptions));

    gconf.setVertexOutputFormatClass(UseCase1OutputFormat.class);
    gconf.setComputationClass(UseCase1Vertex.class);
    GiraphJob job = new GiraphJob(gconf, "Use Case 1");
    FileOutputFormat.setOutputPath(job.getInternalJob(), new Path(arg0[1]));
    return job.run(true) ? 0 : -1;

Best,
Yasser

From: Avery Ching [mailto:ach...@apache.org]
Sent: Friday, August 16, 2013 9:50 AM
To: user@giraph.apache.org
Subject: Re: MultiVertexInputFormat

This is doable in Giraph; you can use as many vertex or edge input formats as you like (via GIRAPH-639). You just need to choose MultiVertexInputFormat and/or MultiEdgeInputFormat.

See VertexInputFormatDescription for vertex input formats

    /**
     * VertexInputFormats description - JSON array containing a JSON array for
     * each vertex input.
     * Vertex input JSON arrays contain one or two elements -
     * first one is the name of vertex input class, and second one is JSON object
     * with all specific parameters for this vertex input. For example:
     * [["VIF1",{"p":"v1"}],["VIF2",{"p":"v2","q":"v"}]]
     */
    public static final StrConfOption VERTEX_INPUT_FORMAT_DESCRIPTIONS =
        new StrConfOption("giraph.multiVertexInput.descriptions", null,
            "VertexInputFormats description - JSON array containing a JSON " +
            "array for each vertex input. Vertex input JSON arrays contain " +
            "one or two elements - first one is the name of vertex input " +
            "class, and second one is JSON object with all specific parameters " +
            "for this vertex input. For example: [[\"VIF1\",{\"p\":\"v1\"}]," +
            "[\"VIF2\",{\"p\":\"v2\",\"q\":\"v\"}]]\"");

See EdgeInputFormatDescription for edge input formats

    /**
     * EdgeInputFormats description - JSON array containing a JSON array for
     * each edge input. Edge input JSON arrays contain one or two elements -
     * first one is the name of edge input class, and second one is JSON object
     * with all specific parameters for this edge input. For example:
     * [["EIF1",{"p":"v1"}],["EIF2",{"p":"v2","q":"v"}]]
     */
    public static final StrConfOption EDGE_INPUT_FORMAT_DESCRIPTIONS =
        new StrConfOption("giraph.multiEdgeInput.descriptions", null,
            "EdgeInputFormats description - JSON array containing a JSON array " +
            "for each edge input. Edge input JSON arrays contain one or two " +
            "elements - first one is the name of edge input class, and second " +
            "one is JSON object with all specific parameters for this edge " +
            "input. For example: [[\"EIF1\",{\"p\":\"v1\"}]," +
            "[\"EIF2\",{\"p\":\"v2\",\"q\":\"v\"}]]");

Hope that helps,

Avery

On 8/16/13 8:45 AM, Yasser Altowim wrote:

Guys, any help with this will be appreciated. Thanks.

From: Yasser Altowim [mailto:yasser.alto...@ericsson.com]
Sent: Thursday, August 15, 2013 2:07 PM
To: user@giraph.apache.org
Subject: MultiVertexInputFormat

Hi,

I am implementing an algorithm using Giraph. My algorithm needs to read input data from two files, each has its own format. My questions are:

1. How can I use the MultiVertexInputFormat class? Is there any example that shows how this class can be used?
2. How can I specify this class when running my job using the Giraph Runner or using a driver class?

Thanks in advance.

Best,
Yasser
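
For readers following the thread, the description format quoted from the Javadoc in Avery's reply can be sketched in plain Java. The snippet below builds the JSON string by hand, without any Giraph classes, only to make visible the shape of the value stored under giraph.multiVertexInput.descriptions. The class and helper names (MultiInputDescriptions, quote, describe, describeAll) are made up for illustration, and file2.txt is a placeholder. Using a parameter called giraph.vertex.input.dir to carry a per-input path is an assumption that nothing in this thread confirms; check which parameter your input format reads in your Giraph version.

```java
import java.util.Arrays;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class MultiInputDescriptions {

    /** Wraps a value in double quotes, escaping backslashes and double quotes only. */
    public static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    /**
     * Renders one input as ["className",{"key":"value",...}],
     * or as ["className"] when there are no parameters.
     */
    public static String describe(String className, Map<String, String> params) {
        StringBuilder sb = new StringBuilder("[").append(quote(className));
        if (!params.isEmpty()) {
            List<String> pairs = new ArrayList<>();
            for (Map.Entry<String, String> e : params.entrySet()) {
                pairs.add(quote(e.getKey()) + ":" + quote(e.getValue()));
            }
            sb.append(",{").append(String.join(",", pairs)).append("}");
        }
        return sb.append("]").toString();
    }

    /** Wraps the per-input arrays in the outer JSON array. */
    public static String describeAll(List<String> inputs) {
        return "[" + String.join(",", inputs) + "]";
    }

    public static void main(String[] args) {
        // ASSUMPTION: this parameter name is a guess at how a per-input path
        // could be passed; it is not confirmed anywhere in the thread.
        String pathKey = "giraph.vertex.input.dir";

        Map<String, String> first = new LinkedHashMap<>();
        first.put(pathKey, "file1.txt");
        Map<String, String> second = new LinkedHashMap<>();
        second.put(pathKey, "file2.txt"); // placeholder file name

        // Simple class names as written in the thread; a real description
        // needs names that Giraph can resolve to the actual classes.
        String json = describeAll(Arrays.asList(
            describe("UseCase1FirstVertexInputFormat", first),
            describe("UseCase1SecondVertexInputFormat", second)));

        System.out.println(json);
    }
}
```

Calling describe with the class name "VIF1" and the single parameter p=v1, then with "VIF2" and the parameters p=v2 and q=v, and passing both results to describeAll reproduces the Javadoc example [["VIF1",{"p":"v1"}],["VIF2",{"p":"v2","q":"v"}]]. In a real job the string would come from InputFormatDescription.toJsonString, as in the code posted above, rather than from hand-built JSON.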