For first-time loads I currently POST to /update/csv?commit=false&separator=%09&escape=\&stream.file=workfile.txt&map=NULL:&keepEmpty=false. This works well and finishes in about 20 minutes for my workload.
The load is mostly CPU bound. I have an 8-core box, and it seems one core takes the brunt of the work. If I wanted to optimize, would I see any benefit from splitting workfile.txt in two and doing two posts? I'm running Lucid's build of Solr 1.3.0 on Jetty 6. IO is not a bottleneck, as the data folder is on tmpfs.

Thanks much,
--joe
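
For concreteness, the split-and-post idea could be sketched roughly like this in Python. It is only an illustration under some assumptions: the host and port in SOLR_CSV_URL, the ".partN" file names, and all function names are made up here, and it assumes the first line of workfile.txt is the CSV header, so that line is repeated in each part. Whether two concurrent posts actually spread the work across cores is exactly the open question.

```python
import urllib.parse
import urllib.request
from concurrent.futures import ThreadPoolExecutor

# Assumed location of the CSV update handler; adjust to the real host/port.
SOLR_CSV_URL = "http://localhost:8983/solr/update/csv"


def split_with_header(path, parts):
    """Split a file into up to `parts` files on line boundaries,
    repeating the first (header) line in each part.
    Reads the whole file into memory, which is fine for a sketch only."""
    with open(path, encoding="utf-8", newline="") as src:
        header = src.readline()
        rows = src.readlines()
    chunk = -(-len(rows) // parts)  # ceiling division
    out_paths = []
    for i in range(parts):
        part_rows = rows[i * chunk:(i + 1) * chunk]
        if not part_rows:
            break
        out_path = f"{path}.part{i}"  # hypothetical naming scheme
        with open(out_path, "w", encoding="utf-8", newline="") as dst:
            dst.write(header)
            dst.writelines(part_rows)
        out_paths.append(out_path)
    return out_paths


def build_url(stream_file):
    """Build the same request as in the post, with the values URL-encoded."""
    params = {
        "commit": "false",
        "separator": "\t",
        "escape": "\\",
        "stream.file": stream_file,
        "map": "NULL:",
        "keepEmpty": "false",
    }
    return SOLR_CSV_URL + "?" + urllib.parse.urlencode(params)


def post(url):
    """Send an empty-bodied POST; Solr reads the data via stream.file."""
    req = urllib.request.Request(url, data=b"", method="POST")
    with urllib.request.urlopen(req) as resp:
        return resp.status


def load_in_parallel(path, parts=2):
    """Split the file, then issue one POST per part concurrently."""
    urls = [build_url(p) for p in split_with_header(path, parts)]
    with ThreadPoolExecutor(max_workers=parts) as pool:
        return list(pool.map(post, urls))
```

Calling load_in_parallel("workfile.txt", parts=2) would issue the two posts at the same time. The part files have to be at a path Solr can read through stream.file, and since commit=false is kept, a separate commit is still needed once both posts return.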