RE: BUG ? SequenceFileVertexInputFormat with custom Writable : can not iterate over vertices

2014-09-10 Thread olivier.varene
Anyone ?

-----Original Message-----
From: VARENE Olivier DTSI/DSI
Sent: Thursday, 4 September 2014 11:47
To: user@giraph.apache.org
Subject: BUG ? SequenceFileVertexInputFormat with custom Writable : can not iterate over vertices

Hi,

first of all many thanks to the community for the great work you are doing.


with giraph 1.1-SNAPSHOT
built for hadoop 2.4.0


I am stuck on a bug when using a custom SequenceFile vertex input format.


I have opened a project on github to allow you to run/play/view the bug

https://github.com/ovarene/giraph_SequenceFileVertexInput


you will find everything needed :D



I am trying to load/create my vertices from a SequenceFile<LongWritable, MyWritable>:
LongWritable being the vertexId
MyWritable being a custom value Writable used to generate the VertexValue and its edges


Vertices are created and loaded correctly (as seen in the log)

2014-09-04 10:44:44,683 DEBUG [load-0] MyReader (SequenceFileLongMyVertexInputFormat.java:getCurrentVertex(108)) - Create Vertex from 1.0 0.0 0: to : Vertex(id=1,value=1.0,#edges=0)
2014-09-04 10:44:44,694 DEBUG [load-0] MyReader (SequenceFileLongMyVertexInputFormat.java:getCurrentVertex(108)) - Create Vertex from 2.0 0.0 2:1|6| to : Vertex(id=2,value=2.0,#edges=2)
2014-09-04 10:44:44,695 DEBUG [load-0] MyReader (SequenceFileLongMyVertexInputFormat.java:getCurrentVertex(108)) - Create Vertex from 5.0 0.0 3:4|1|6| to : Vertex(id=5,value=5.0,#edges=3)

2014-09-04 10:44:44,697 INFO [load-0] worker.InputSplitsCallable (InputSplitsCallable.java:call(235)) - call: Loaded 1 input splits in 0.12198963 secs, (v=3, e=5) 24.592255 vertices/sec, 40.98709 edges/sec


But alas, when trying to iterate over them during a compute session, I get the 
following error

ERROR : But when reading in PrintVertex (BasicComputation):
2014-09-04 10:44:44,701 ERROR [load-0] utils.LogStacktraceCallable (LogStacktraceCallable.java:call(57)) - Execution of callable failed
java.lang.IllegalStateException: next: IOException
    at org.apache.giraph.utils.VertexIterator.next(VertexIterator.java:101)
    at org.apache.giraph.partition.BasicPartition.addPartitionVertices(BasicPartition.java:99)
    at org.apache.giraph.comm.requests.SendWorkerVerticesRequest.doRequest(SendWorkerVerticesRequest.java:110)
    at org.apache.giraph.comm.netty.NettyWorkerClientRequestProcessor.doRequest(NettyWorkerClientRequestProcessor.java:482)
    at org.apache.giraph.comm.netty.NettyWorkerClientRequestProcessor.flush(NettyWorkerClientRequestProcessor.java:428)
    at org.apache.giraph.worker.InputSplitsCallable.call(InputSplitsCallable.java:241)
    at org.apache.giraph.worker.InputSplitsCallable.call(InputSplitsCallable.java:60)
    at org.apache.giraph.utils.LogStacktraceCallable.call(LogStacktraceCallable.java:51)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: ensureRemaining: Only 5 bytes remaining, trying to read 8
    at org.apache.giraph.utils.UnsafeByteArrayInputStream.ensureRemaining(UnsafeByteArrayInputStream.java:114)
    at org.apache.giraph.utils.UnsafeByteArrayInputStream.readLong(UnsafeByteArrayInputStream.java:197)
    at org.apache.hadoop.io.LongWritable.readFields(LongWritable.java:47)
    at org.apache.giraph.utils.WritableUtils.reinitializeVertexFromDataInput(WritableUtils.java:522)
    at org.apache.giraph.utils.VertexIterator.next(VertexIterator.java:98)
    ... 11 more


Any idea ?

Thanks a lot

Olivier
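
The thread does not identify a cause. The trace shows LongWritable.readFields running out of bytes inside reinitializeVertexFromDataInput, that is, while serialized vertices are being read back. One thing that may be worth ruling out (an assumption, not a diagnosis) is a mismatch between write() and readFields() in the custom Writable. The sketch below is plain Java with no Hadoop dependency; MyValueSketch and RoundTripCheck are hypothetical names, and the field layout (a double plus a list of target ids) is only a guess at what MyWritable holds.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the custom value class. It mirrors the two
// methods of Hadoop's Writable interface without depending on Hadoop.
class MyValueSketch {
  double value;
  final List<Long> targets = new ArrayList<>();

  // Writes the value, then the number of targets, then each target id.
  void write(DataOutput out) throws IOException {
    out.writeDouble(value);
    out.writeInt(targets.size());
    for (long target : targets) {
      out.writeLong(target);
    }
  }

  // Reads the fields back in exactly the order write() emitted them.
  void readFields(DataInput in) throws IOException {
    value = in.readDouble();
    int count = in.readInt();
    targets.clear(); // objects may be reused, so reset old state first
    for (int i = 0; i < count; i++) {
      targets.add(in.readLong());
    }
  }
}

class RoundTripCheck {
  // Serializes v, reads it into a fresh object, and fails if readFields()
  // consumed fewer or more bytes than write() produced.
  static MyValueSketch roundTrip(MyValueSketch v) {
    try {
      ByteArrayOutputStream buffer = new ByteArrayOutputStream();
      v.write(new DataOutputStream(buffer));
      DataInputStream in =
          new DataInputStream(new ByteArrayInputStream(buffer.toByteArray()));
      MyValueSketch copy = new MyValueSketch();
      copy.readFields(in);
      if (in.available() != 0) {
        throw new IllegalStateException(in.available() + " bytes left unread");
      }
      return copy;
    } catch (IOException e) {
      // Reading past the end of the buffer (EOFException) lands here.
      throw new UncheckedIOException(e);
    }
  }
}
```

Running such a round trip on the real value class in a unit test would show whether the two methods agree on the byte layout before the job is submitted.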













Re: How do I validate customArguments?

2014-09-10 Thread Matthew Saltz
No worries. By the way, I realized after sending that message that using the
public static int numPreprocessingSteps to store the value in the
MasterCompute class doesn't work; you need to register a persistent
aggregator to hold on to it, if you need to.

Best,
Matthew

On Wed, Sep 10, 2014 at 3:15 PM, Matthew Cornell m...@matthewcornell.org
wrote:

 Sorry for the long delay, Matthew. That's really helpful. Right now I'm
 stuck on apparently running out of memory on our little cluster, but the
 log messages are confusing. I'm putting together a question, but in the
 meantime I'll try one of the simpler examples such as degree count to see
 if /anything/ will run against my graph, which is very small (100K nodes
 and edges). -- matt

 On Thu, Aug 28, 2014 at 2:26 PM, Matthew Saltz sal...@gmail.com wrote:

 Matt,

 I'm not sure if you've resolved this problem already or not, but if you
 haven't: The initialize() method isn't limited to registering aggregators,
 and in fact, in my project I use it to do exactly what you're describing:
 checking and loading custom configuration parameters. Inside the
 initialize() method, I do this:

 String numPreprocessingStepsConf =
     getConf().get(NUMBER_OF_PREPROCESSING_STEPS_CONF_OPT);
 numPreprocessingSteps = (numPreprocessingStepsConf != null) ?
     Integer.parseInt(numPreprocessingStepsConf.trim()) :
     DEFAULT_NUMBER_OF_PREPROCESSING_STEPS;
 System.out.println("Number of preprocessing steps: " +
     numPreprocessingSteps);

 where at the class level I declare:

   public static final String NUMBER_OF_PREPROCESSING_STEPS_CONF_OPT =
       "wcc.numPreprocessingSteps";
   public static final int DEFAULT_NUMBER_OF_PREPROCESSING_STEPS = 1;
   public static int numPreprocessingSteps;

 To set the property, I use the option -ca
 wcc.numPreprocessingSteps=<number of steps I want>. If you only need to
 check that the argument is properly formatted, without storing it, this is
 a fine place to do that as well, given that it's run before the input
 superstep (see the giraph code in BspServiceMaster, line 1617 in the stable
 1.1.0 release). What happens is that on the master, the MasterThread calls
 coordinateSuperstep() on a BspServiceMaster object, which checks if it's
 the input superstep, and if so, calls initialize() on the MasterCompute
 object (created in the becomeMaster() method of BspServiceMaster).

 Hope this helps,
 Matthew
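
The parsing step described in this message can be pulled out into a small pure function so it can be tested without a cluster. This is only a sketch: PreprocessingStepsOption and parse are hypothetical names, the raw string is assumed to be whatever getConf().get(...) returned (null when the option is unset), and rejecting negative values is an added assumption, not something the message states.

```java
// Hypothetical helper, independent of Giraph. The key and the default
// value are the ones quoted in the message.
class PreprocessingStepsOption {
  static final String CONF_KEY = "wcc.numPreprocessingSteps";
  static final int DEFAULT_STEPS = 1;

  // Returns the default when the option is unset, otherwise the parsed
  // value; throws IllegalArgumentException on malformed input.
  static int parse(String raw) {
    if (raw == null) {
      return DEFAULT_STEPS;
    }
    int steps;
    try {
      steps = Integer.parseInt(raw.trim());
    } catch (NumberFormatException e) {
      throw new IllegalArgumentException(
          CONF_KEY + " must be an integer, got: '" + raw + "'", e);
    }
    if (steps < 0) { // assumption: a negative step count makes no sense
      throw new IllegalArgumentException(
          CONF_KEY + " must not be negative, got: " + steps);
    }
    return steps;
  }
}
```

Calling such a helper from initialize() would make a badly formatted -ca value fail with a readable message instead of a bare NumberFormatException.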



 On Tue, Aug 26, 2014 at 4:36 PM, Matthew Cornell m...@matthewcornell.org
  wrote:

 Hi again. My application needs to pass in a String argument to the
 computation which each Vertex needs access to. (The argument is a list of
 the form [item1, item2, ...].) I found --customArguments (which I set in
 my tests via conf.set(arg_name, arg_val)) but I need to check that it's
 properly formatted. Where do I do that? The only thing I thought of is to
 specify a DefaultMasterCompute subclass whose initialize() does the check,
 but all the initialize() examples do is register aggregators; none of them
 check args or do anything else. Thanks in advance! -- matt

 --
 Matthew Cornell | m...@matthewcornell.org | 413-626-3621 | 34 Dickinson
 Street, Amherst MA 01002 | matthewcornell.org
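
For the list-shaped argument in this question, a format check might look like the sketch below. The thread does not specify the exact grammar, so the rules here are assumptions: square brackets are required, items are comma-separated, surrounding whitespace is ignored, and empty items are rejected. ListArgument and parse are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical validator for an argument of the form "[item1, item2, ...]".
class ListArgument {
  // Returns the items in order, or throws IllegalArgumentException if the
  // string does not match the assumed format.
  static List<String> parse(String raw) {
    if (raw == null) {
      throw new IllegalArgumentException("argument is missing");
    }
    String text = raw.trim();
    if (text.length() < 2 || !text.startsWith("[") || !text.endsWith("]")) {
      throw new IllegalArgumentException("expected [item1, item2, ...], got: " + raw);
    }
    String body = text.substring(1, text.length() - 1).trim();
    List<String> items = new ArrayList<>();
    if (body.isEmpty()) {
      return items; // "[]" is treated as an empty list
    }
    // The -1 limit keeps trailing empty strings so "[a,]" is rejected too.
    for (String part : body.split(",", -1)) {
      String item = part.trim();
      if (item.isEmpty()) {
        throw new IllegalArgumentException("empty item in list: " + raw);
      }
      items.add(item);
    }
    return items;
  }
}
```

Because the check is a plain static method, it can be called from initialize() on the master and reused wherever a vertex computation reads the same option.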





 --
 Matthew Cornell | m...@matthewcornell.org | 413-626-3621 | 34 Dickinson
 Street, Amherst MA 01002 | matthewcornell.org