When I went through different repositories looking for spam data, I could
only find files a few MB in size.
To test on Hadoop we need a large file, right?
I need to test my Hadoop SVM implementation. I looked at
http://archive.ics.uci.edu/ml/machine-learning-databases/spambase/ but that
dataset is only about 700 KB. I need a similar but much larger dataset.
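
A possible stopgap while looking for a bigger dataset, sketched here under
assumptions rather than as a tested recipe: resample the small file with a
little random jitter until it reaches the size needed. The sketch assumes a
comma-separated, all-numeric file whose last column is the class label; the
function names and the noise parameter are hypothetical. A jittered copy adds
no new information, so it would only exercise job throughput and scaling, not
classifier accuracy.

```python
import csv
import random


def inflate_rows(rows, target_rows, noise=0.05, seed=0):
    """Yield target_rows synthetic rows resampled from rows.

    Each input row is a list of floats whose last element is the class
    label. Every feature is scaled by a random factor in
    [1 - noise, 1 + noise]; the label is copied unchanged.
    """
    rng = random.Random(seed)
    for _ in range(target_rows):
        base = rng.choice(rows)
        features = [x * (1.0 + rng.uniform(-noise, noise)) for x in base[:-1]]
        yield features + [base[-1]]


def write_inflated_csv(src_path, dst_path, target_rows, noise=0.05, seed=0):
    """Read a small numeric CSV and write a larger, jittered copy of it."""
    with open(src_path, newline="") as src:
        rows = [[float(v) for v in rec] for rec in csv.reader(src) if rec]
    with open(dst_path, "w", newline="") as dst:
        writer = csv.writer(dst)
        # Rows are streamed to disk one at a time, so the output can be
        # much larger than memory.
        for row in inflate_rows(rows, target_rows, noise, seed):
            writer.writerow(row)
```

For example, write_inflated_csv("spambase.data", "spambase_big.csv",
50_000_000) would write fifty million rows; both file names are placeholders.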


On Sat, Nov 23, 2013 at 8:35 AM, unmesha sreeveni <unmeshab...@gmail.com> wrote:

> Thanks Devin :) That was a nice explanation.
>
>
> On Fri, Nov 22, 2013 at 6:20 PM, Devin Suiter RDX <dsui...@rdx.com> wrote:
>
>> They are both machine learning techniques. Classification is known as
>> "supervised learning": you feed the engine data with known patterns and
>> tell it what the key nodes are. Clustering is "unsupervised learning":
>> you let the algorithm "guess" at what is significant in the correlations
>> it picks up. Spam filtering is a popular example of classification, and
>> image indexing is a popular example of clustering. Both are mainly used
>> on Hadoop because, when it comes to machine learning, the more data that
>> passes through the algorithm the more accurate it should be, and Hadoop
>> can handle large data better than anything else around at the moment.
>>
>> *Devin Suiter*
>> Jr. Data Solutions Software Engineer
>> 100 Sandusky Street | 2nd Floor | Pittsburgh, PA 15212
>> Google Voice: 412-256-8556 | www.rdx.com
>>
>>
>> On Fri, Nov 22, 2013 at 2:54 AM, unmesha sreeveni
>> <unmeshab...@gmail.com> wrote:
>>
>>> What is the difference between classification algorithms and clustering
>>> algorithms in Hadoop?
>>>
>>>
>>> --
>>> *Thanks & Regards*
>>>
>>> Unmesha Sreeveni U.B
>>>
>>> *Junior Developer*
>>>
>>>
>>>
>>
>
>
> --
> *Thanks & Regards*
>
> Unmesha Sreeveni U.B
>
> *Junior Developer*
>
>
>


-- 
*Thanks & Regards*

Unmesha Sreeveni U.B

*Junior Developer*
