See, I think I'm just misunderstanding how this entity is supposed to be set up ... for example, using the patch on 1.3 I ended up in a loop where .n is never set ...

Feb 2, 2009 1:31:02 PM org.apache.solr.handler.dataimport.HttpDataSource getData
INFO: Created URL to: http://subdomain.site.com/feed.rss?page=

<entity dataSource="blogs"
        url=" http://subdomain.site.com/boards.rss?page=${blogs.n}"
        chunkSize="50" name="docs" pk="link"
        processor="XPathEntityProcessor"
        forEach="/rss/channel/item"
        transformer="RegexTransformer, com.nhl.solr.DateFormatTransformer, TemplateTransformer, com.nhl.solr.EnumeratedEntityTransformer">

I guess what I'm looking for is a snippet that shows how it is set up (the initial counter) ...

- Jon

On Mon, Feb 2, 2009 at 12:39 PM, Noble Paul നോബിള് नोब्ळ् <noble.p...@gmail.com> wrote:

> On Mon, Feb 2, 2009 at 11:01 PM, Jon Baer <jonb...@gmail.com> wrote:
> > Yes, I think what Jared mentions in the JIRA is what I was thinking about
> > when it is recommended to always return true for $hasMore ...
> >
> > "The transformer must know somehow when $hasMore should be true. If the
> > transformer always gives $hasMore a value "true", will there be infinite
> > requests made or will it stop on the first empty request? Using the
> > EnumeratedEntityTransformer, a user can specify from the config xml when
> > $hasMore should be true using the chunkSize attribute. This solves a general
> > case of "request N rows at a time until no more are available". I agree, a
> > combination of 'rowsFetchedCount' and a HasMoreUntilEmptyTransformer would
> > also make this doable from the configuration"
>
> Why can't a Transformer put a $hasMore=false?
>
> > This makes sense.
> >
> > - Jon
> >
> > Jared Flatow <https://issues.apache.org/jira/secure/ViewProfile.jspa?name=jflatow> - 28/Jan/09 09:16 PM
> > The transformer must know somehow when $hasMore should be true. If the
> > transformer always gives $hasMore a value "true", will there be infinite
> > requests made or will it stop on the first empty request? Using the
> > EnumeratedEntityTransformer, a user can specify from the config xml when
> > $hasMore should be true using the chunkSize attribute. This solves a general
> > case of "request N rows at a time until no more are available". I agree, a
> > combination of 'rowsFetchedCount' and a HasMoreUntilEmptyTransformer would
> > also make this doable from the configuration.
> >
> > On Mon, Feb 2, 2009 at 11:53 AM, Shalin Shekhar Mangar <shalinman...@gmail.com> wrote:
> >
> >> On Mon, Feb 2, 2009 at 9:20 PM, Jon Baer <jonb...@gmail.com> wrote:
> >>
> >> > Hi,
> >> >
> >> > Sorry, I know this exists ...
> >> >
> >> > "If an API supports chunking (when the dataset is too large) multiple calls
> >> > need to be made to complete the process. XPathEntityProcessor supports this
> >> > with a transformer. If the transformer returns a row which contains a field
> >> > *$hasMore* with the value "true", the processor makes another request with
> >> > the same url template (the actual value is recomputed before invoking). A
> >> > transformer can pass a totally new url too for the next call by returning a
> >> > row which contains a field *$nextUrl* whose value must be the complete url
> >> > for the next call."
> >> >
> >> > But is there a true example of its use somewhere? I'm trying to figure out,
> >> > if I know before import that I have 56 "pages" to index, how to set this up
> >> > properly. (And how to set it up if pages need to be determined by something
> >> > in the feed, etc).
> >>
> >> No, there is no example (yet). You'll put the url with variables for the
> >> corresponding 'start' and 'count' parameters and a custom transformer can
> >> specify if another request needs to be made. I know it's not much to go on.
> >> I'll try to write some documentation on the wiki.
> >>
> >> SOLR-994 might be interesting to you. I haven't been able to look at the
> >> patch though.
> >>
> >> https://issues.apache.org/jira/browse/SOLR-994
> >> --
> >> Regards,
> >> Shalin Shekhar Mangar.
>
> --
> --Noble Paul
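
Since the thread asks for an example of the $hasMore / $nextUrl handshake, here is a rough stand-alone sketch of the idea for the "I know I have 56 pages" case. It is only an illustration and does not use any Solr classes: PagedFeedSketch, PageCountingTransformer, crawl and the currentPage argument are all made-up names, and the loop in crawl merely imitates the behaviour described in the wiki text quoted above (request again while a row carries $hasMore="true", using $nextUrl when present). In a real DataImportHandler setup the transformer would presumably extend org.apache.solr.handler.dataimport.Transformer and implement transformRow(Map<String,Object> row, Context context); how it learns the current page there is left open here.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Stand-alone sketch: imitates the shape of a DIH transformer without depending on Solr.
class PagedFeedSketch {

    // Hypothetical transformer that decides whether another page should be requested.
    static class PageCountingTransformer {
        private final String urlPrefix; // e.g. "http://subdomain.site.com/boards.rss?page="
        private final int lastPage;     // known before the import starts, e.g. 56

        PageCountingTransformer(String urlPrefix, int lastPage) {
            this.urlPrefix = urlPrefix;
            this.lastPage = lastPage;
        }

        // Assumption: the current page number is handed in explicitly.
        // Adds $hasMore (and $nextUrl when applicable) to the row and returns it.
        Map<String, Object> transformRow(Map<String, Object> row, int currentPage) {
            if (currentPage < lastPage) {
                row.put("$hasMore", "true");
                row.put("$nextUrl", urlPrefix + (currentPage + 1));
            } else {
                // Explicitly stop, so the loop cannot run forever.
                row.put("$hasMore", "false");
            }
            return row;
        }
    }

    // Simulated processor loop (not Solr's code): returns every URL that would be requested.
    static List<String> crawl(String urlPrefix, int firstPage, int lastPage) {
        PageCountingTransformer transformer = new PageCountingTransformer(urlPrefix, lastPage);
        List<String> requested = new ArrayList<>();
        String url = urlPrefix + firstPage;
        int page = firstPage;
        while (url != null) {
            requested.add(url);
            // Pretend the feed at this URL produced a single row; a real feed yields many.
            Map<String, Object> row = transformer.transformRow(new HashMap<>(), page);
            boolean more = "true".equals(row.get("$hasMore"));
            url = more ? (String) row.get("$nextUrl") : null;
            page++;
        }
        return requested;
    }

    public static void main(String[] args) {
        List<String> urls = crawl("http://subdomain.site.com/boards.rss?page=", 1, 56);
        System.out.println(urls.size() + " requests, last: " + urls.get(urls.size() - 1));
    }
}
```

Run as-is, this should print "56 requests, last: http://subdomain.site.com/boards.rss?page=56". The point of the sketch is that the stop condition lives in the transformer: once it writes $hasMore="false" (or leaves the field out), no further request is made, which also speaks to the question of why a transformer cannot simply put $hasMore=false.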