You generally want to do linguistic pre-processing (finding phrases,
normalizing variant forms such as abbreviations to a single term, tokenizing,
dropping stop words, removing boilerplate, removing tables) before doing
vectorization. Together, these steps make up pre-processing.
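As a rough illustration of three of the steps listed above (tokenizing, dropping stop words, and mapping abbreviations to one form), here is a minimal sketch in plain Java. It deliberately does not use the Mahout API. The class name PreprocessSketch, the stop-word list, and the abbreviation map are all made up for the example, and a real pipeline would likely use a proper analyzer and a much larger stop-word list. The term counts at the end are only meant to show the kind of input a vectorizer would start from.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

public class PreprocessSketch {
    // Hypothetical stop-word list; a real one would be much longer.
    static final Set<String> STOP_WORDS =
            new HashSet<>(Arrays.asList("a", "an", "and", "the", "of", "to", "in", "is"));

    // Hypothetical abbreviation map used to fold variants into one form.
    static final Map<String, String> SYNONYMS = new LinkedHashMap<>();
    static {
        SYNONYMS.put("ml", "machine_learning");
        SYNONYMS.put("nlp", "natural_language_processing");
    }

    // Lower-case the text, split it into tokens, drop stop words,
    // and replace known abbreviations with their canonical form.
    static List<String> preprocess(String text) {
        List<String> tokens = new ArrayList<>();
        // Split on any run of characters that are not letters or digits.
        for (String raw : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (raw.isEmpty() || STOP_WORDS.contains(raw)) {
                continue;
            }
            tokens.add(SYNONYMS.getOrDefault(raw, raw));
        }
        return tokens;
    }

    // Count how often each token occurs, keeping first-seen order.
    static Map<String, Integer> termCounts(List<String> tokens) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String token : tokens) {
            counts.merge(token, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> tokens = preprocess("The basics of ML and NLP in the ML book");
        System.out.println(tokens);
        System.out.println(termCounts(tokens));
    }
}
```

Phrase finding, boilerplate removal, and table removal are left out here because they depend heavily on the format of the books.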
To classify books, you need to
Dear Suresh,
I am also working on classification of books.
First of all, I collect the metadata of my e-books. After collecting the
metadata, I start the second stage, which is pre-processing each e-book. In
pre-processing, I collect information regarding *book titles, chapter
titles, sections, subsection
you please tell me which is good and how to use it?
Thanks,
Suresh
On 16 January 2014 14:49, Saeed Iqbal KhattaK
saeediqbalkhat...@gmail.com wrote:
Dear Suresh,
I am also working on classification of books.
First of all, I collect the metadata of my e-books. After collecting the
metadata, I then
-for-machine-learning-part-1/
http://www.scaleunlimited.com/2013/07/21/text-feature-selection-for-machine-learning-part-2/
-- Ken
Hi,
Our application will be getting books from different users.
We have to classify them accordingly.
Could someone please tell me how to do that using Apache Mahout and Java?
Is Hadoop necessary for that?
--
Thanks and Regards
Suresh
Hi,
Can you please tell me what that pre-processing means? Is it
vectorization (as explained in the Mahout in Action book)?
Can it be done using Java and the Mahout API?
And what does "the model" mean? Is it a class?
On 16 January 2014 11:38, KK R kirubakumar...@gmail.com wrote:
Hi Suresh,
Apache Mahout