Yes, the first job is there to distribute the slices of input data among
the BSP processors.
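
As a rough illustration of the idea only (a hypothetical sketch, not Hama's or MRQL's actual partitioning code; the class and method names are made up), an input of a given size can be cut into one contiguous byte range per BSP task like this:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Hama or MRQL.
class SliceSketch {
    /** Returns one [start, end) byte range per task, covering totalBytes. */
    static List<long[]> slices(long totalBytes, int numTasks) {
        List<long[]> out = new ArrayList<>();
        long base = totalBytes / numTasks;   // minimum bytes per task
        long extra = totalBytes % numTasks;  // first 'extra' tasks get one more byte
        long start = 0;
        for (int i = 0; i < numTasks; i++) {
            long len = base + (i < extra ? 1 : 0);
            out.add(new long[] {start, start + len});
            start += len;
        }
        return out;
    }

    public static void main(String[] args) {
        // Illustrative numbers: the IO_BYTES_READ value and task count from the log.
        for (long[] s : slices(20204222L, 8)) {
            System.out.println(s[0] + " .. " + s[1]);
        }
    }
}
```

With 8 tasks, each range is roughly one eighth of the input, which is the kind of per-task figure the "Each task will handle about ... bytes" line in the log below reports.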

On Sat, Apr 27, 2013 at 1:58 AM, Leonidas Fegaras <[email protected]> wrote:
> OK. MRQL works fine now with Hama 0.7.0 in distributed mode.
> I haven't tested it on a real cluster yet.
> I am attaching the output from pagerank.
> By the way, Hama 0.7.0 runs 2 jobs for each BSPjob, although the first is
> fast.
> Is this done to distribute the data among peers?
> Leonidas
>
> 13/04/26 10:13:50 INFO mortbay.log: Logging to
> org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via
> org.mortbay.log.Slf4jLog
> *** Using 8 BSP tasks (out of a max 8). Each task will handle about 2525538
> bytes of input data.
> 13/04/26 10:13:50 INFO bsp.FileInputFormat: Total input paths to process : 1
> 13/04/26 10:13:50 INFO bsp.FileInputFormat: Total input paths to process : 1
> 13/04/26 10:13:50 INFO bsp.BSPJobClient: Running job: job_201304260948_0020
> 13/04/26 10:13:53 INFO bsp.BSPJobClient: Current supersteps number: 0
> 13/04/26 10:14:02 INFO bsp.BSPJobClient: Current supersteps number: 2
> 13/04/26 10:14:05 INFO bsp.BSPJobClient: The total number of supersteps: 2
> 13/04/26 10:14:05 INFO bsp.BSPJobClient: Counters: 6
> 13/04/26 10:14:05 INFO bsp.BSPJobClient:
> org.apache.hama.bsp.JobInProgress$JobCounter
> 13/04/26 10:14:05 INFO bsp.BSPJobClient:     SUPERSTEPS=2
> 13/04/26 10:14:05 INFO bsp.BSPJobClient:     LAUNCHED_TASKS=1
> 13/04/26 10:14:05 INFO bsp.BSPJobClient:
> org.apache.hama.bsp.BSPPeerImpl$PeerCounter
> 13/04/26 10:14:05 INFO bsp.BSPJobClient:     SUPERSTEP_SUM=2
> 13/04/26 10:14:05 INFO bsp.BSPJobClient:     TIME_IN_SYNC_MS=178
> 13/04/26 10:14:05 INFO bsp.BSPJobClient: IO_BYTES_READ=20204222
> 13/04/26 10:14:05 INFO bsp.BSPJobClient: TASK_INPUT_RECORDS=918362
> 13/04/26 10:14:05 INFO bsp.FileInputFormat: Total input paths to process : 8
> 13/04/26 10:14:06 INFO bsp.BSPJobClient: Running job: job_201304260948_0019
> 13/04/26 10:14:09 INFO bsp.BSPJobClient: Current supersteps number: 0
> 13/04/26 10:14:18 INFO bsp.BSPJobClient: Current supersteps number: 2
> 13/04/26 10:14:30 INFO bsp.BSPJobClient: Current supersteps number: 3
> 13/04/26 10:14:33 INFO bsp.BSPJobClient: Current supersteps number: 4
> 13/04/26 10:14:36 INFO bsp.BSPJobClient: Current supersteps number: 5
> 13/04/26 10:14:42 INFO bsp.BSPJobClient: Current supersteps number: 6
> 13/04/26 10:14:45 INFO bsp.BSPJobClient: Current supersteps number: 8
> 13/04/26 10:14:54 INFO bsp.BSPJobClient: Current supersteps number: 11
> 13/04/26 10:15:03 INFO bsp.BSPJobClient: Current supersteps number: 14
> 13/04/26 10:15:12 INFO bsp.BSPJobClient: Current supersteps number: 18
> 13/04/26 10:15:15 INFO bsp.BSPJobClient: Current supersteps number: 19
> 13/04/26 10:15:15 INFO bsp.BSPJobClient: The total number of supersteps: 19
> 13/04/26 10:15:15 INFO bsp.BSPJobClient: Counters: 9
> 13/04/26 10:15:15 INFO bsp.BSPJobClient:
> org.apache.hama.bsp.JobInProgress$JobCounter
> 13/04/26 10:15:15 INFO bsp.BSPJobClient:     SUPERSTEPS=19
> 13/04/26 10:15:15 INFO bsp.BSPJobClient:     LAUNCHED_TASKS=8
> 13/04/26 10:15:15 INFO bsp.BSPJobClient:
> org.apache.hama.bsp.BSPPeerImpl$PeerCounter
> 13/04/26 10:15:15 INFO bsp.BSPJobClient:     SUPERSTEP_SUM=152
> 13/04/26 10:15:15 INFO bsp.BSPJobClient: TIME_IN_SYNC_MS=132721
> 13/04/26 10:15:15 INFO bsp.BSPJobClient: IO_BYTES_READ=22986388
> 13/04/26 10:15:15 INFO bsp.BSPJobClient: TOTAL_MESSAGES_SENT=5694804
> 13/04/26 10:15:15 INFO bsp.BSPJobClient: TASK_INPUT_RECORDS=918362
> 13/04/26 10:15:15 INFO bsp.BSPJobClient:     COMPRESSED_MESSAGES=8
> 13/04/26 10:15:15 INFO bsp.BSPJobClient: TOTAL_MESSAGES_RECEIVED=5694804
>
>
>
>
>
> On 04/25/2013 08:05 PM, Edward J. Yoon wrote:
>>
>> Oh.. thanks. Here's another snapshot:
>>
>> http://people.apache.org/~edwardyoon/dist/0.7.0-SNAPSHOT/hama-0.7.0-SNAPSHOT2.tar.gz
>>
>> I've tested successfully on my laptop. Can you please test one more
>> time with this?
>>
>> On Fri, Apr 26, 2013 at 1:44 AM, Leonidas Fegaras <[email protected]>
>> wrote:
>>>
>>> OK. I tested it on my 8-core laptop. It seems that the problem with comma
>>> separated HDFS files in distributed mode has not been fixed yet:
>>>
>>>
>>> FileInputFormat.setInputPaths(job,"hdfs://localhost:9000/user/fegaras/tests/data/orders.tbl,hdfs://localhost:9000/user/fegaras/tests/data/customer.tbl");
>>>
>>>
>>> I get the error:
>>> java.net.URISyntaxException: Relative path in absolute URI:
>>> localhost:9000
>>>
>>> So I can't do joins.
>>> Queries that work on a single input file work fine in distributed mode.
>>> Their runtime on my laptop is comparable to that of Hama 0.5.0.
>>> Leonidas
>>>
>>>
>>>
>>> On 04/24/2013 03:25 AM, Edward J. Yoon wrote:
>>>>
>>>> Leonidas,
>>>>
>>>> Could you please test with
>>>> http://people.apache.org/~edwardyoon/dist/0.7.0-SNAPSHOT/ and give me
>>>> feedback?
>>>>
>>>> On Tue, Apr 23, 2013 at 11:07 PM, Leonidas Fegaras <[email protected]>
>>>> wrote:
>>>>>
>>>>> Yes, I think this is fine. I can test a pre-release of Hama 0.6.2 to make
>>>>> sure that it works well with MRQL.
>>>>> I have also extended the MRQL make/ant files to work with Yarn. They will
>>>>> be part of the next patch. I have tested MRQL on Yarn in local mode only
>>>>> because I don't have access to a Yarn cluster.
>>>>> Leonidas
>>>>>
>>>>>
>>>>>
>>>>> On Apr 22, 2013, at 6:10 PM, Edward J. Yoon wrote:
>>>>>
>>>>>> Since Hama 0.6 is more memory efficient than the old version, let's
>>>>>> try to release based on Hama 0.6.*. I want to evaluate both the MR
>>>>>> and BSP versions of MRQL with large data sets on my cluster. I'll fix
>>>>>> that problem soon and release Hama 0.6.2. What do you think?
>>>>>>
>>>>>> On Thu, Apr 18, 2013 at 6:22 AM, Edward J. Yoon
>>>>>> <[email protected]>
>>>>>> wrote:
>>>>>>>
>>>>>>> +1
>>>>>>>
>>>>>>> On Thu, Apr 18, 2013 at 12:12 AM, Leonidas Fegaras
>>>>>>> <[email protected]>
>>>>>>> wrote:
>>>>>>>>
>>>>>>>> Edward,
>>>>>>>> Unfortunately, the current MRQL doesn't work correctly with Hama 0.6.x.
>>>>>>>> It works fine with Hama 0.5.0.
>>>>>>>> (The splits generated by the FileInputFormat in Hama 0.6.0 cannot be
>>>>>>>> smaller than a block, while Hama 0.6.1 doesn't work correctly with
>>>>>>>> comma-separated paths, which prevents joins.)
>>>>>>>> We can wait for the next Hama release (date?) or we can just release it
>>>>>>>> as is for Hama 0.5.0.
>>>>>>>> In either case, let's set a tentative release date of May 15, so we will
>>>>>>>> have one month to write all guides and to set up a testbed.
>>>>>>>> Do you agree to have our first release on May 15?
>>>>>>>> Leonidas
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Apr 17, 2013, at 2:55 AM, Edward J. Yoon wrote:
>>>>>>>>
>>>>>>>>> I personally would recommend you release a first Apache MRQL (with a
>>>>>>>>> well-described guide on how to get started or involved) that works
>>>>>>>>> with open source Apache Hadoop 1.0 and Hama 0.6.x.
>>>>>>>>>
>>>>>>>>> On Sat, Apr 13, 2013 at 12:38 AM, Leonidas Fegaras
>>>>>>>>> <[email protected]>
>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> I think the obvious person to manage the first release is me, if
>>>>>>>>>> there is no other volunteer.
>>>>>>>>>> I don't have any experience with release plans. Do we need to set up
>>>>>>>>>> a timeline for future releases?
>>>>>>>>>> Maybe we should develop a testbed first to be run on different
>>>>>>>>>> cluster sizes before each official release.
>>>>>>>>>> Leonidas
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Apr 11, 2013, at 8:42 PM, Edward J. Yoon wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi all,
>>>>>>>>>>>
>>>>>>>>>>> What are our plans for our first release under ASF? And who is going
>>>>>>>>>>> to do the release managing?
>>>>>>>>>>>
>>>>>>>>>>> --
>>>>>>>>>>> Best Regards, Edward J. Yoon
>>>>>>>>>>> @eddieyoon
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Best Regards, Edward J. Yoon
>>>>>>>>> @eddieyoon
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Best Regards, Edward J. Yoon
>>>>>>> @eddieyoon
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Best Regards, Edward J. Yoon
>>>>>> @eddieyoon
>>>>>
>>>>>
>>>>
>>
>>
>



-- 
Best Regards, Edward J. Yoon
@eddieyoon
