> On Thu, Apr 18, 2013 at 9:47 AM, zheyi rong wrote:
>
>> Dear all,
>>
>> I am writing to ask for ideas on computing a Cartesian product in Hadoop.
>> Specifically, I have two datasets, each of which contains 20 million
>> lines.
>> I wan
hen output.
>>>
>>>
>>> Note -- You may not know the keys (K1, K2, …, Km) beforehand. If so,
>>> you need one more pass over dataset1 to identify the keys and store them
>>> for use with dataset2.
>>>
>>>
>>> Regards,
>>> Ajay S
reasonable one, but it will be much slower than methods where you can
prune the comparisons.
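
To illustrate the brute-force approach discussed in this thread, here is a minimal single-machine Python sketch (not Hadoop or Pig code). It assumes each dataset is a list of lines and that the comparison can be written as a predicate; the names `split_blocks`, `cross_compare`, and `keep` are hypothetical and chosen only for this example. Each block pair is independent, so on a cluster each one could in principle be handled by a separate task.

```python
from itertools import product
from typing import Callable, Iterator, List, Sequence, Tuple


def split_blocks(lines: Sequence[str], n_blocks: int) -> List[List[str]]:
    """Split lines into n_blocks chunks, assigning lines round-robin."""
    blocks: List[List[str]] = [[] for _ in range(n_blocks)]
    for i, line in enumerate(lines):
        blocks[i % n_blocks].append(line)
    return blocks


def cross_compare(
    left: Sequence[str],
    right: Sequence[str],
    keep: Callable[[str, str], bool],
    n_blocks: int = 2,
) -> Iterator[Tuple[str, str]]:
    """Compare every line of left with every line of right, block by block.

    Only pairs for which keep(a, b) is true are emitted, so the output can
    stay small even though every pair is still compared.
    """
    left_blocks = split_blocks(left, n_blocks)
    right_blocks = split_blocks(right, n_blocks)
    # Every (left block, right block) combination is an independent unit of work.
    for lb, rb in product(left_blocks, right_blocks):
        for a, b in product(lb, rb):
            if keep(a, b):
                yield a, b


if __name__ == "__main__":
    matches = cross_compare(["a", "b", "c"], ["a", "c", "d"], lambda a, b: a == b)
    for pair in matches:
        print(pair)
```

Filtering the output does not reduce the number of comparisons: two inputs of 20 million lines each still mean 4 x 10^14 pairs, which is why methods that prune comparisons are much faster when they apply.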
On Thu, Apr 18, 2013 at 9:47 AM, zheyi rong wrote:
> Dear all,
>
> I am writing to ask for ideas on computing a Cartesian product in Hadoop.
> Specifically, now I have two datasets, each of wh
@gmail.com>> wrote:
Dear all,
I am writing to ask for ideas on computing a Cartesian product in Hadoop.
Specifically, I have two datasets, each of which contains 20 million lines.
I want to compute the Cartesian product of these two datasets, comparing lines
pairwise.
The output of each comparis
>
>>
>> On 18-Apr-2013, at 3:51 PM, Azuryy Yu wrote:
>>
>> This is not suitable for his large dataset.
>>
>> --Send from my Sony mobile.
>> On Apr 18, 2013 5:58 PM, "Jagat Singh" wrote:
>>
>>> Hi,
>>>
>>>
obile.
On Apr 18, 2013 5:58 PM, "Jagat Singh" wrote:
Hi,
Can you have a look at
http://pig.apache.org/docs/r0.11.1/basic.html#cross
Thanks
On Thu, Apr 18, 2013 at 7:47 PM, zheyi rong wrote:
Dear all,
I am writing
:
>
>> Hi,
>>
>> Can you have a look at
>>
>> http://pig.apache.org/docs/r0.11.1/basic.html#cross
>>
>> Thanks
>>
>>
>> On Thu, Apr 18, 2013 at 7:47 PM, zheyi rong wrote:
>>
>>> Dear all,
>>>
>>
:47 PM, zheyi rong wrote:
>
>> Dear all,
>>
>> I am writing to ask for ideas on computing a Cartesian product in Hadoop.
>> Specifically, I have two datasets, each of which contains 20 million
>> lines.
>> I want to do a Cartesian product on these two datasets, c
Hi,
Can you have a look at
http://pig.apache.org/docs/r0.11.1/basic.html#cross
Thanks
On Thu, Apr 18, 2013 at 7:47 PM, zheyi rong wrote:
> Dear all,
>
> I am writing to ask for ideas on computing a Cartesian product in Hadoop.
> Specifically, now I have two datasets, each of wh
Dear all,
I am writing to ask for ideas on computing a Cartesian product in Hadoop.
Specifically, I have two datasets, each of which contains 20 million
lines.
I want to compute the Cartesian product of these two datasets, comparing lines
pairwise.
The output of each comparison can be mostly