Currently there is no way to select only the specific fields that you need to store in the DB. Each field is used in crawl processing.
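Since every column is needed during the crawl, one possible workaround is to leave the table as it is and expose only the columns you care about through a database view. The sketch below is only an illustration under several assumptions: it uses Python's built-in sqlite3 as a stand-in for MySQL so that it runs anywhere, it assumes the table is named webpage, and the view name webpage_slim, the chosen columns, and the sample row are all hypothetical.

```python
import sqlite3

# Hypothetical subset of columns the application wants to read.
WANTED = ("id", "title", "text", "fetchTime")

# In-memory SQLite database used as a stand-in for the real MySQL instance.
conn = sqlite3.connect(":memory:")

# Reduced copy of the schema from the question (assumed table name: webpage).
conn.execute(
    "CREATE TABLE webpage ("
    "id TEXT PRIMARY KEY, title TEXT, text TEXT, status INTEGER, "
    "score REAL, fetchTime INTEGER, content BLOB)"
)

# Illustrative sample row, not real crawl output.
conn.execute(
    "INSERT INTO webpage VALUES (?, ?, ?, ?, ?, ?, ?)",
    ("com.example:http/", "Example", "Body text", 2, 1.0, 1388450000000, b"<html/>"),
)

# The view exposes only the wanted columns; the underlying table keeps
# every field, so nothing the crawler relies on is removed.
conn.execute(
    "CREATE VIEW webpage_slim AS SELECT {} FROM webpage".format(", ".join(WANTED))
)

cur = conn.execute("SELECT * FROM webpage_slim")
columns = [d[0] for d in cur.description]
rows = cur.fetchall()
print(columns)
print(rows)
```

The view only changes what a reader sees; it does not reduce the storage used by the full table.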
Why do you want to store only some specific fields in the DB?

On Tue, Dec 31, 2013 at 2:13 AM, rk_sharma <[email protected]> wrote:

> Hi.
>
> I am using Nutch-2.1 with MySQL as the database. My crawl results are correct
> and multiple fields are stored in the DB. My DB fields are:
>
> +-------------------+---------------+------+-----+---------+-------+
> | Field             | Type          | Null | Key | Default | Extra |
> +-------------------+---------------+------+-----+---------+-------+
> | id                | varchar(767)  | NO   | PRI | NULL    |       |
> | headers           | blob          | YES  |     | NULL    |       |
> | text              | mediumtext    | YES  |     | NULL    |       |
> | status            | int(11)       | YES  |     | NULL    |       |
> | markers           | blob          | YES  |     | NULL    |       |
> | parseStatus       | blob          | YES  |     | NULL    |       |
> | modifiedTime      | bigint(20)    | YES  |     | NULL    |       |
> | score             | float         | YES  |     | NULL    |       |
> | typ               | varchar(32)   | YES  |     | NULL    |       |
> | baseUrl           | varchar(767)  | YES  |     | NULL    |       |
> | content           | longblob      | YES  |     | NULL    |       |
> | title             | varchar(2048) | YES  |     | NULL    |       |
> | reprUrl           | varchar(767)  | YES  |     | NULL    |       |
> | fetchInterval     | int(11)       | YES  |     | NULL    |       |
> | prevFetchTime     | bigint(20)    | YES  |     | NULL    |       |
> | inlinks           | mediumblob    | YES  |     | NULL    |       |
> | prevSignature     | blob          | YES  |     | NULL    |       |
> | outlinks          | mediumblob    | YES  |     | NULL    |       |
> | fetchTime         | bigint(20)    | YES  |     | NULL    |       |
> | retriesSinceFetch | int(11)       | YES  |     | NULL    |       |
> | protocolStatus    | blob          | YES  |     | NULL    |       |
> | signature         | blob          | YES  |     | NULL    |       |
> | metadata          | blob          | YES  |     | NULL    |       |
> +-------------------+---------------+------+-----+---------+-------+
>
> Is there any mechanism through which I can remove some of these columns? In
> other words, I need to store only some specific columns.
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Store-specific-nutch-output-values-in-database-tp4108762.html
> Sent from the Nutch - User mailing list archive at Nabble.com.

--
Don't Grow Old, Grow Up... :-)

