Re: loading vertices into RAM

Edward J. Yoon Tue, 21 Jan 2014 16:24:27 -0800

There's no way to solve your problem until a disk-based sort queue is
implemented. For now, you should increase the max heap size and the number
of machines.
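
As a rough way to see why more heap or more machines helps, here is a
back-of-the-envelope sketch. It assumes one task per machine, evenly
partitioned vertices, and that the whole heap is available for vertex data
(none of which is measured, and the last one is optimistic). The class and
method names are made up for illustration and are not part of Hama.

```java
class HeapBudget {
    // Upper bound on heap bytes available per vertex, assuming one task per
    // machine, an even partition, and the whole heap free for vertex data.
    static long bytesPerVertex(long vertices, int machines, long heapMbPerTask) {
        long totalHeapBytes = (long) machines * heapMbPerTask * 1024L * 1024L;
        return totalHeapBytes / vertices;
    }

    public static void main(String[] args) {
        // Numbers from this thread: 10 million vertices, 5 machines, -Xmx2048m.
        System.out.println(bytesPerVertex(10_000_000L, 5, 2048) + " bytes per vertex");
        // Doubling the heap or the machine count doubles the budget.
        System.out.println(bytesPerVertex(10_000_000L, 5, 4096) + " bytes per vertex");
        System.out.println(bytesPerVertex(10_000_000L, 10, 2048) + " bytes per vertex");
    }
}
```

The vertex object, its edges, its value and the message queues all have to
fit in that budget, so the real limit is lower than the number printed.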


On Tue, Jan 21, 2014 at 11:58 PM, Ammar Sahib <[email protected]> wrote:
> Hi Edward
>
> I tried to run my program with the DiskVerticesInfo option on a cluster 
> of 5 "virtual" machines, each with 4 GB of RAM. I configured the heap memory 
> to 2048 MB (-Xmx2048m).
>
> I am working with a graph consisting of 10 million vertices. After around 3 
> hours I get a Java heap space error. Do you think that using virtual 
> machines instead of real physical machines might have something to do with 
> this problem?
>
> The problem that I get:
> 14/01/21 15:22:23 ERROR bsp.LocalBSPRunner: Exception during BSP execution!
> java.util.concurrent.ExecutionException: java.lang.OutOfMemoryError: Java heap space
>         at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:252)
>         at java.util.concurrent.FutureTask.get(FutureTask.java:111)
>         at org.apache.hama.bsp.LocalBSPRunner$ThreadObserver.run(LocalBSPRunner.java:313)
>         at java.lang.Thread.run(Thread.java:724)
> Caused by: java.lang.OutOfMemoryError: Java heap space
>
>
>
> My configuration file content:
>
> <configuration>
> <property>
>     <name>bsp.master.address</name>
>     <value>master</value>
> </property>
> <property>
>     <name>bsp.system.dir</name>
>     <value>/tmp/hama-hadoop/bsp/system</value>
> </property>
> <property>
>     <name>bsp.local.dir</name>
>     <value>/tmp/hama-hadoop/bsp/local</value>
> </property>
> <property>
>     <name>hama.tmp.dir</name>
>     <value>/tmp/hama-hadoop</value>
> </property>
> <property>
>     <name>fs.default.name</name>
>     <value>hdfs://master:54310</value>
> </property>
> <property>
>     <name>hama.zookeeper.quorum</name>
>     <value>master,slave1,slave2,slave3,slave4</value>
> </property>
> <property>
>     <name>bsp.child.java.opts</name>
>     <value>-Xmx2048m</value>
> </property>
> </configuration>
>
>
>
>
>
> On Tuesday, January 21, 2014 2:52 AM, Edward J. Yoon <[email protected]> 
> wrote:
>
> To use OffHeapVerticesInfo, you need to add the Apache DirectMemory
> libraries to the lib folder.
>
> Or, try DiskVerticesInfo.
>
> With the trunk version, I was able to run a 30 thousand vertex graph on a
> single machine, and 1B vertices on a full rack cluster (child opt:
> -Xmx2048m).
>
>
> On Tue, Jan 21, 2014 at 1:57 AM, Ammar Sahib <[email protected]> wrote:
>> Hi
>>
>> Thanks for the reply. I am using the HAMA version from the TRUNK and I am 
>> running my own algorithm. I am trying to work with a graph 
>> consisting of 10 million vertices. Has anyone worked with big 
>> graphs (millions of vertices) using HAMA? Can you please share your 
>> experience?
>>
>>
>> I am now trying to use:
>>
>> conf.setClass("hama.graph.vertices.info",
>>     org.apache.hama.graph.OffHeapVerticesInfo.class,
>>     org.apache.hama.graph.VerticesInfo.class);
>>
>>
>> I get the error:
>>
>>
>> 14/01/20 17:42:16 ERROR bsp.LocalBSPRunner: Exception during BSP execution!
>> java.util.concurrent.ExecutionException: java.lang.NoClassDefFoundError: org/apache/directmemory/utils/CacheValuesIterable
>>         at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:252)
>>         at java.util.concurrent.FutureTask.get(FutureTask.java:111)
>>         at org.apache.hama.bsp.LocalBSPRunner$ThreadObserver.run(LocalBSPRunner.java:313)
>>         at java.lang.Thread.run(Thread.java:724)
>> Caused by: java.lang.NoClassDefFoundError: org/apache/directmemory/utils/CacheValuesIterable
>>         at org.apache.hama.graph.OffHeapVerticesInfo.skippingIterator(OffHeapVerticesInfo.java:112)
>>         at org.apache.hama.graph.GraphJobRunner.cleanup(GraphJobRunner.java:163)
>>         at org.apache.hama.bsp.LocalBSPRunner$BSPRunner.run(LocalBSPRunner.java:262)
>>         at org.apache.hama.bsp.LocalBSPRunner$BSPRunner.call(LocalBSPRunner.java:286)
>>         at org.apache.hama.bsp.LocalBSPRunner$BSPRunner.call(LocalBSPRunner.java:211)
>>         at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>>         at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>>         at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>>         at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>
>> The log of my master is as follows:
>>
>> /************************************************************
>> STARTUP_MSG: Starting BSPMaster
>> STARTUP_MSG:   host = c3-large1-master/10.255.255.2
>> STARTUP_MSG:   args = []
>> STARTUP_MSG:   version = 1.2.0
>> STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.2 -r 1479473; compiled by 'hortonfo' on Mon May  6 18:29:07 UTC 2013
>> STARTUP_MSG:   java = 1.7.0_25
>> ************************************************************/
>> 2014-01-14 21:27:35,808 INFO org.apache.hama.bsp.BSPMaster: RPC BSPMaster: host master port 40000
>> 2014-01-14 21:27:37,200 INFO org.apache.hama.ipc.Server: Starting Socket Reader #1 for port 40000
>> 2014-01-14 21:27:37,732 INFO org.mortbay.log: Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog
>> 2014-01-14 21:27:38,147 INFO org.apache.hama.http.HttpServer: Port returned by webServer.getConnectors()[0].getLocalPort() before open() is -1. Opening the listener on 40013
>> 2014-01-14 21:27:38,168 INFO org.apache.hama.http.HttpServer: listener.getLocalPort() returned 40013 webServer.getConnectors()[0].getLocalPort() returned 40013
>> 2014-01-14 21:27:38,168 INFO org.apache.hama.http.HttpServer: Jetty bound to port 40013
>> 2014-01-14 21:27:38,168 INFO org.mortbay.log: jetty-6.1.14
>> 2014-01-14 21:27:38,446 INFO org.mortbay.log: Extract jar:file:/usr/local/hama-0.6.3/hama-core-0.6.3.jar!/webapp/bspmaster/ to /tmp/Jetty_master_40013_bspmaster____ge2lxf/webapp
>> 2014-01-14 21:27:40,162 INFO org.mortbay.log: Started SelectChannelConnector@master:40013
>> 2014-01-14 21:27:40,734 INFO org.apache.hama.bsp.BSPMaster: Cleaning up the system directory
>> 2014-01-14 21:27:40,734 INFO org.apache.hama.bsp.BSPMaster: hdfs://master:54310/tmp/hama-hadoop/bsp/system
>> 2014-01-14 21:27:40,991 INFO org.apache.hama.bsp.sync.ZKSyncBSPMasterClient: Initialized ZK false
>> 2014-01-14 21:27:40,991 INFO org.apache.hama.bsp.sync.ZKSyncClient: Initializing ZK Sync Client
>> 2014-01-14 21:27:41,073 INFO org.apache.hama.ipc.Server: IPC Server Responder: starting
>> 2014-01-14 21:27:41,077 INFO org.apache.hama.ipc.Server: IPC Server listener on 40000: starting
>> 2014-01-14 21:27:41,085 INFO org.apache.hama.ipc.Server: IPC Server handler 0 on 40000: starting
>> 2014-01-14 21:27:41,088 INFO org.apache.hama.bsp.BSPMaster: Starting RUNNING
>> 2014-01-14 21:27:41,168 INFO org.apache.hama.bsp.BSPMaster: groomd_slave2_50000 is added.
>> 2014-01-14 21:27:49,634 INFO org.apache.hama.bsp.BSPMaster: groomd_slave1_50000 is added.
>> 2014-01-14 21:28:15,943 INFO org.apache.hama.bsp.BSPMaster: groomd_master_50000 is added.
>>
>>
>>
>>
>>
>>
>>
>> On Sunday, January 19, 2014 7:58 AM, Tommaso Teofili 
>> <[email protected]> wrote:
>>
>> Yes, the correct way of setting OffHeapVerticesInfo is:
>>
>> conf.setClass("hama.graph.vertices.info",
>>     org.apache.hama.graph.OffHeapVerticesInfo.class,
>>     org.apache.hama.graph.VerticesInfo.class);
>>
>> Apart from that, what Hama version are you running?
>> Looking at the code in trunk, it shouldn't be possible to get an NPE on the
>> currentVertex if the iterator is consumed correctly. If, however, one doesn't
>> call hasNext before next, and/or calls next even though hasNext returned
>> false, then that NPE is possible.
>> Also, what algorithm / example are you running? Any information (like
>> environment, execution mode, logs, version, etc.) would be useful to help
>> you.
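
To illustrate the iterator contract described here, a minimal sketch using
only the standard java.util collections (not Hama's vertex iterator, whose
internals are not shown in this thread). The class name is made up for
illustration. The loop asks hasNext() before every next(); calling next()
after hasNext() has returned false is a contract violation, which a
standard list iterator reports with NoSuchElementException, while other
iterators may fail differently, e.g. with the NPE reported in this thread.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

class IteratorContractDemo {
    // Consumes the iterator correctly: hasNext() is checked before each next().
    static int sumAll(Iterator<Integer> it) {
        int sum = 0;
        while (it.hasNext()) {
            sum += it.next();
        }
        return sum;
    }

    // Returns true if calling next() on an exhausted iterator fails.
    static boolean nextFailsWhenExhausted(Iterator<Integer> it) {
        while (it.hasNext()) {
            it.next();
        }
        try {
            it.next(); // contract violation: hasNext() already returned false
            return false;
        } catch (NoSuchElementException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<Integer> values = Arrays.asList(1, 2, 3);
        System.out.println("sum = " + sumAll(values.iterator()));
        System.out.println("next() after exhaustion fails: "
                + nextFailsWhenExhausted(values.iterator()));
    }
}
```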
>>
>> Tommaso
>>
>>
>>
>>
>> 2014/1/19 步青云 <[email protected]>
>>
>>> I have the same problem with loading vertices into RAM, and I am trying to
>>> use OffHeapVerticesInfo.
>>> You can use the setClass method like this:
>>> conf.setClass("hama.graph.vertices.info",
>>>     org.apache.hama.graph.OffHeapVerticesInfo.class,
>>>     org.apache.hama.graph.VerticesInfo.class);
>>> However, I get a NullPointerException when using OffHeapVerticesInfo. The
>>> errors are as follows:
>>>
>>> 14/01/18 20:54:23 ERROR bsp.LocalBSPRunner: Exception during BSP execution!
>>> java.lang.NullPointerException
>>>     at org.apache.hama.graph.OffHeapVerticesInfo$1.next(OffHeapVerticesInfo.java:139)
>>>     at org.apache.hama.graph.GraphJobRunner.doSuperstep(GraphJobRunner.java:251)
>>>     at org.apache.hama.graph.GraphJobRunner.bsp(GraphJobRunner.java:145)
>>>     at org.apache.hama.bsp.LocalBSPRunner$BSPRunner.run(LocalBSPRunner.java:256)
>>>     at org.apache.hama.bsp.LocalBSPRunner$BSPRunner.call(LocalBSPRunner.java:286)
>>>
>>> Could anyone help me solve this problem? Thanks a lot.
>>>
>>>
>>>
>>>
>>> ------------------ Original ------------------
>>> From:  "Ammar Sahib"<[email protected]>;
>>> Date:  Jan 17, 2014
>>> To:  "[email protected]"<[email protected]>;
>>>
>>> Subject:  Re: loading vertices into RAM
>>>
>>>
>>>
>>> I think we are getting close now. However, I now get a runtime exception:
>>>
>>> Exception in thread "main" java.lang.RuntimeException: interface org.apache.hama.graph.VerticesInfo not org.apache.hama.graph.ListVerticesInfo
>>>     at org.apache.hadoop.conf.Configuration.setClass(Configuration.java:858)
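
The message above reads "<interface> not <implementation>", which is
consistent with the two class arguments being passed in swapped order: the
three-argument setClass expects the implementation class first and the
interface second. Below is a minimal sketch of that kind of assignability
check. The nested types are hypothetical stand-ins for the Hama classes,
and checkedSetClass is a made-up helper that imitates the check, not
Hadoop's Configuration.setClass itself.

```java
class SetClassOrderDemo {
    // Hypothetical stand-ins for the Hama types named in this thread.
    interface VerticesInfo {}
    static class ListVerticesInfo implements VerticesInfo {}

    // Imitates the check: theClass must be assignable to xface, otherwise
    // a RuntimeException of the form "<theClass> not <xface name>" is thrown.
    static String checkedSetClass(Class<?> theClass, Class<?> xface) {
        if (!xface.isAssignableFrom(theClass)) {
            throw new RuntimeException(theClass + " not " + xface.getName());
        }
        return theClass.getName();
    }

    public static void main(String[] args) {
        // Implementation first, interface second: accepted.
        System.out.println(checkedSetClass(ListVerticesInfo.class, VerticesInfo.class));

        // Swapped order: rejected, with a message shaped like the one above.
        try {
            checkedSetClass(VerticesInfo.class, ListVerticesInfo.class);
        } catch (RuntimeException e) {
            System.out.println(e.getMessage());
        }
    }
}
```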
>>>
>>>
>>>
>>>
>>>
>>> On Friday, January 17, 2014 2:30 PM, Tommaso Teofili <
>>> [email protected]> wrote:
>>>
>>> Ah yes, sorry, you also have to specify the interface. I don't have the
>>> code in front of me, but it should be:
>>>
>>> conf.setClass("hama.graph.vertices.info",
>>>     org.apache.hama.graph.VerticesInfo.class,
>>>     org.apache.hama.graph.ListVerticesInfo.class);
>>>
>>> Tommaso
>>>
>>>
>>>
>>> 2014/1/17 Ammar Sahib <[email protected]>
>>>
>>> > Hi
>>> >
>>> >
>>> > Thanks for your reply. I now use:
>>> >
>>> > conf.setClass("hama.graph.vertices.info",
>>> >     org.apache.hama.graph.ListVerticesInfo.class);
>>> >
>>> > Now I get this error:
>>> > The method setClass(String, Class<?>, Class<?>) in the type Configuration
>>> > is not applicable for the arguments (String, Class<ListVerticesInfo>)
>>> >
>>> > I am using HAMA 0.6.3
>>> >
>>> >
>>> >
>>> >
>>> >
>>> > On Friday, January 17, 2014 12:59 PM, Tommaso Teofili <
>>> > [email protected]> wrote:
>>> >
>>> > You're passing the fully qualified name of the class as a String to the
>>> > method setClass(String, Class), while you should pass the Class itself,
>>> > e.g.:
>>> > HamaConfiguration conf = new HamaConfiguration();
>>> > conf.setClass("hama.graph.vertices.info",
>>> >     org.apache.hama.graph.ListVerticesInfo.class);
>>> >
>>> > Hope this helps,
>>> > Tommaso
>>> >
>>> >
>>> >
>>> >
>>> > 2014/1/17 Ammar Sahib <[email protected]>
>>> >
>>> > > Hi
>>> > >
>>> > > I am trying to evaluate the different implementations below:
>>> > >
>>> > >
>>> > > - ListVerticesInfo: loads vertices into an array list.
>>> > > - MapVerticesInfo: loads vertices into a tree map.
>>> > > - DiskVerticesInfo: loads vertices into a local file.
>>> > >
>>> > > When using the conf.setClass method I got an error. Below is a sample of
>>> > > my code:
>>> > > HamaConfiguration conf = new HamaConfiguration();
>>> > > conf.setClass("hama.graph.vertices.info",
>>> > >     "org.apache.hama.graph.ListVerticesInfo");
>>> > >
>>> > > The error I am getting is:
>>> > > The method setClass(String, Class<?>, Class<?>) in the type Configuration
>>> > > is not applicable for the arguments (String, String).
>>> > >
>>> > > However, I found that I can use the conf.set method.
>>> > >
>>> > >
>>> > > Can someone tell me what I am doing wrong?
>>> > >
>>> > >
>>> > >
>>> > >
>>> > >
>>> > > On Wednesday, January 15, 2014 8:01 AM, Tommaso Teofili <
>>> > > [email protected]> wrote:
>>> > >
>>> > > and OffHeapVerticesInfo for loading vertices off heap, which is available
>>> > > with 0.6.3 as well, if I recall correctly.
>>> > > Tommaso
>>> > >
>>> > >
>>> > >
>>> > > 2014/1/15 Edward J. Yoon <[email protected]>
>>> > >
>>> > > > There are a few implementations.
>>> > > >
>>> > > >  - ListVerticesInfo: loads vertices into an array list.
>>> > > >  - MapVerticesInfo: loads vertices into a tree map.
>>> > > >  - DiskVerticesInfo: loads vertices into a local file.
>>> > > >
>>> > > > You can choose one of them by setting "hama.graph.vertices.info"
>>> > > > in the job configuration.
>>> > > >
>>> > > >   > conf.setClass("hama.graph.vertices.info",
>>> > > > "org.apache.hama.graph.ListVerticesInfo");
>>> > > >
>>> > > > With the latest 0.6.3 version, you can use only ListVerticesInfo.
>>> > > > Please use the TRUNK.
>>> > > >
>>> > > >
>>> > > > On Tue, Jan 14, 2014 at 11:18 PM, Ammar Sahib <[email protected]> wrote:
>>> > > > > Hi
>>> > > > >
>>> > > > > According to the BSP model, the data is processed in RAM, and that is
>>> > > > > the reason why the Pregel model is faster than MapReduce (MapReduce
>>> > > > > writes to disk). Can someone explain to me how to be sure that all the
>>> > > > > graph vertices have actually been loaded into RAM?
>>> > > > >
>>> > > > >
>>> > > > > How would HAMA behave if the vertex values are so big that the
>>> > > > > available RAM is not enough to contain all of the vertices?
>>> > > > >
>>> > > > > Regards
>>> > > >
>>> > > >
>>> > > >
>>> > > > --
>>> > > > Best Regards, Edward J. Yoon
>>> > > > @eddieyoon
>>> > > >
>>> > > >
>>> > >
>>> >
>>>
>
>
>
> --
> Best Regards, Edward J. Yoon
> @eddieyoon



-- 
Best Regards, Edward J. Yoon
@eddieyoon
