Lee, note that that's TIME_WAIT *not* TIMED_WAIT, and there's no need to check the server, just the client. Any TIME_WAIT sockets will disappear fairly quickly: the best way to check is during the test itself.

Have you thought about garbage collection? The fact that the error occurs after a given number of inserts is suggestive. If the GC thread locks everything else off the CPU for a long enough period of time, the server will time out connections. This is especially likely to happen if the program's working set is a large percentage of the Java heap size (which may in turn indicate either leaks that you could fix, or a need for a larger heap).

You might try instrumenting your code to report insert times in Java, and also report the elapsed time when you see exceptions. Then monitor the Java process size as your program runs. You may be able to correlate longer elapsed times for inserts with GC events, which would tend to confirm this hypothesis.
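
As a minimal sketch of that instrumentation, the class below times each call and records how much collector time the JVM accumulated while it ran, using the standard java.lang.management beans. The names InsertTimer, Timing, and time() are made up for illustration, and the lambda in main() is only a stand-in for the real session.insertContent() call.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.concurrent.Callable;

/** Illustrative helper (hypothetical name): times one call and the GC time accrued during it. */
final class InsertTimer {

    /** Outcome of one timed call. All times are in milliseconds. */
    static final class Timing {
        final long elapsedMillis;
        final long gcMillis;
        final Exception failure; // null when the call succeeded

        Timing(long elapsedMillis, long gcMillis, Exception failure) {
            this.elapsedMillis = elapsedMillis;
            this.gcMillis = gcMillis;
            this.failure = failure;
        }
    }

    /** Sums the accumulated collection time reported by every collector in this JVM. */
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime(); // -1 when the collector does not report a time
            if (t > 0) {
                total += t;
            }
        }
        return total;
    }

    /** Runs the work once, capturing elapsed time, GC time, and any exception it threw. */
    static Timing time(Callable<?> work) {
        long gcBefore = totalGcMillis();
        long start = System.nanoTime();
        Exception failure = null;
        try {
            work.call();
        } catch (Exception e) {
            failure = e; // kept so the caller can log it next to the timings
        }
        long elapsed = (System.nanoTime() - start) / 1_000_000;
        return new Timing(elapsed, totalGcMillis() - gcBefore, failure);
    }

    public static void main(String[] args) {
        // Stand-in for the real insert call: allocate short-lived garbage.
        Timing t = time(() -> {
            long bytes = 0;
            for (int i = 0; i < 2_000; i++) {
                bytes += new byte[64 * 1024].length;
            }
            return bytes;
        });
        System.out.println("elapsed=" + t.elapsedMillis + "ms gc=" + t.gcMillis
                + "ms failure=" + t.failure);
    }
}
```

If slow or failed inserts line up with large gcMillis values, that would tend to support the GC hypothesis. The collection time reported by these beans should be read as a rough indicator rather than an exact pause measurement.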

When using RecordLoader, XQSync, and Corb with large content sets, I generally use the -XX:+UseConcMarkSweepGC VM option. Sometimes I also raise the max heap size, but some care is required because too large a heap seems to slow things down.
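
To confirm which options actually took effect, a small check along these lines can be run with the same command-line flags as the loader. It uses only standard java.lang.management calls; the class name JvmGcReport is hypothetical. Note that -XX:+UseConcMarkSweepGC is only accepted by older JVMs (it was deprecated in JDK 9 and removed in JDK 14), so whether the flag applies depends on the runtime in use.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

/** Illustrative check (hypothetical name): reports the collectors and heap limit in effect. */
final class JvmGcReport {

    /** Names of the garbage collectors this JVM is actually running with. */
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    /** Maximum heap the JVM will attempt to use, in megabytes. */
    static long maxHeapMegabytes() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        // The VM options passed on the command line, e.g. -Xmx or -XX flags.
        System.out.println("VM options: "
                + ManagementFactory.getRuntimeMXBean().getInputArguments());
        System.out.println("Collectors: " + collectorNames());
        System.out.println("Max heap:   " + maxHeapMegabytes() + " MB");
    }
}
```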

If GC and memory turn out to be involved, I also recommend looking at the whole Java program carefully, with an eye toward minimizing memory utilization and especially toward removing any object leaks. If you are leaking objects, then memory will fill up sooner or later no matter what GC does. There are some good Java profiling tools available to help with this.

-- Mike

On 2010-03-15 07:08, Lee, David wrote:
Thanks Ron, I'm doing all the things you suggest already
1) Reusing a Session
2) bundling 20 files in 1 insertContent()
3) Checked netstat and there are no TIMED_WAIT connections on either
client or server

I'm trying something different this time, which is to use a thread pool
to try to increase the efficiency
of sending the files.  Maybe this will be worse on the system, I don't
know.
Maybe there is some kind of maximum session open time?
The error occurs about 2 hours into the transfer, typically.
I could try closing and reopening the session every hour ...

-David
Server: 4.1-4 on Fedora fc 11
Client: XP/Pro SP3 and Windows 7
XCC: Latest from download


-----Original Message-----
From: general-boun...@developer.marklogic.com
[mailto:general-boun...@developer.marklogic.com] On Behalf Of Ron
Hitchens
Sent: Monday, March 15, 2010 4:49 AM
To: General Mark Logic Developer Discussion
Subject: Re: [MarkLogic Dev
General]ServerConnectionException-consistantly after about 20, 000 files


    You may be filling up your OS's file table.  When a socket
is closed, the OS holds onto it for a while (default usually
about two minutes) to reliably detect any straggler packets.

    When you cycle a lot of connections quickly, this can max out
internal data structures in the OS.  If you do a netstat and
see zillions of connections in TIME_WAIT state, that's probably
what's happening.

    If you're connecting across a LAN, this delay is not really
needed, because it's hard for packets to get rerouted anywhere else.
You can tune the socket wait time down to 5 seconds or so and that
will allow file table slots to be re-used more quickly.

    You can also insert multiple documents per request, all of
which will be transferred together and result in fewer low-level
sockets being opened.
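
A rough sketch of the batching idea, assuming the documents to load are already collected in a list. Batcher and partition() are illustrative names, not part of XCC; with XCC, each batch would be turned into a Content[] and passed to a single insertContent() call.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative helper (hypothetical name): splits a list into fixed-size batches. */
final class Batcher {

    /** Returns consecutive batches of at most batchSize items, preserving order. */
    static <T> List<List<T>> partition(List<T> items, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int i = 0; i < items.size(); i += batchSize) {
            int end = Math.min(i + batchSize, items.size());
            // Copy the view so each batch is independent of the source list.
            batches.add(new ArrayList<>(items.subList(i, end)));
        }
        return batches;
    }

    public static void main(String[] args) {
        List<String> files = new ArrayList<>();
        for (int i = 1; i <= 45; i++) {
            files.add("doc" + i + ".xml");
        }
        // One request per batch instead of one request per file.
        for (List<String> batch : partition(files, 20)) {
            System.out.println("would insert " + batch.size() + " documents in one request");
        }
    }
}
```

With 45 files and a batch size of 20, this makes three requests (20, 20, and 5 documents) rather than 45.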

On Mar 14, 2010, at 9:28 PM, Lee, David wrote:

FYI, here's a stack trace from the same program, but in this case it's
the query component under load.
This is very consistent as well, after about 10-20k requests.


com.marklogic.xcc.exceptions.ServerConnectionException: Error parsing HTTP headers: Premature EOF, partial header line read: ''
  [Session: user=DLEE, cb={default} [ContentSource: user=DLEE, cb={none} [provider: address=home/192.168.1.10:8011, pool=0/64]]]
        at com.marklogic.xcc.impl.handlers.AbstractRequestController.runRequest(AbstractRequestController.java:99)
        at com.marklogic.xcc.impl.SessionImpl.submitRequest(SessionImpl.java:280)
        at org.xmlsh.marklogic.put.setChecksum(put.java:341)
        at org.xmlsh.marklogic.put.flushContent(put.java:315)
        at org.xmlsh.marklogic.put.putContent(put.java:288)
        at org.xmlsh.marklogic.put.load(put.java:272)
        at org.xmlsh.marklogic.put.load(put.java:266)
        at org.xmlsh.marklogic.put.run(put.java:126)
        at org.xmlsh.core.XCommand.run(XCommand.java:86)
        at org.xmlsh.core.XCommand.run(XCommand.java:63)
        at org.xmlsh.sh.core.SimpleCommand.exec(SimpleCommand.java:121)
        at org.xmlsh.sh.shell.Shell.exec(Shell.java:560)
        at org.xmlsh.sh.core.Pipeline.exec(Pipeline.java:124)
        at org.xmlsh.sh.shell.Shell.exec(Shell.java:560)
        at org.xmlsh.sh.shell.Shell.runScript(Shell.java:362)
        at org.xmlsh.core.ScriptCommand.run(ScriptCommand.java:75)
        at org.xmlsh.sh.core.SimpleCommand.exec(SimpleCommand.java:121)
        at org.xmlsh.sh.shell.Shell.exec(Shell.java:560)
        at org.xmlsh.sh.core.Pipeline.exec(Pipeline.java:124)
        at org.xmlsh.sh.shell.Shell.exec(Shell.java:560)
        at org.xmlsh.sh.shell.Shell.runScript(Shell.java:362)
        at org.xmlsh.core.ScriptCommand.run(ScriptCommand.java:75)
        at org.xmlsh.sh.core.SimpleCommand.exec(SimpleCommand.java:121)
        at org.xmlsh.sh.shell.Shell.exec(Shell.java:560)
        at org.xmlsh.sh.core.Pipeline.exec(Pipeline.java:124)
        at org.xmlsh.sh.shell.Shell.exec(Shell.java:560)
        at org.xmlsh.sh.shell.Shell.interactive(Shell.java:461)
        at org.xmlsh.commands.builtin.xmlsh.run(xmlsh.java:82)
        at org.xmlsh.core.BuiltinCommand.run(BuiltinCommand.java:54)
        at org.xmlsh.sh.shell.Shell.main(Shell.java:690)
Caused by: java.io.IOException: Error parsing HTTP headers: Premature EOF, partial header line read: ''
        at com.marklogic.http.HttpHeaders.nextHeaderLine(HttpHeaders.java:326)
        at com.marklogic.http.HttpHeaders.parseResponseHeaders(HttpHeaders.java:287)
        at com.marklogic.http.HttpChannel.parseHeaders(HttpChannel.java:323)
        at com.marklogic.http.HttpChannel.receiveMode(HttpChannel.java:293)
        at com.marklogic.http.HttpChannel.getResponseCode(HttpChannel.java:187)
        at com.marklogic.xcc.impl.handlers.EvalRequestController.issueRequest(EvalRequestController.java:111)
        at com.marklogic.xcc.impl.handlers.EvalRequestController.serverDialog(EvalRequestController.java:62)
        at com.marklogic.xcc.impl.handlers.AbstractRequestController.runRequest(AbstractRequestController.java:72)
        ... 29 more



From: general-boun...@developer.marklogic.com
[mailto:general-boun...@developer.marklogic.com] On Behalf Of Lee, David
Sent: Saturday, March 13, 2010 7:42 PM
To: General Mark Logic Developer Discussion
Subject: RE: [MarkLogic Dev General]
ServerConnectionException-consistantly after about 20, 000 files

Here's a full stack trace, including my code in the stack.
By "opening connections" I mean calling

      URI serverUri = new URI(connect);
      ContentSource cs = ContentSourceFactory.newContentSource(serverUri);

for every file instead of reusing the ContentSource for all files.
Although that may be a red herring ... when I do it that way (a new
ContentSource for each file) I'm not aborting the push operation if one
file fails, so I may be missing these errors in that case.

--------- Stack Trace



2010-03-13 16:17:13,748 12310138 ERROR [main] core.SimpleCommand - Exception running command: ml:put
com.marklogic.xcc.exceptions.ServerConnectionException: An established connection was aborted by the software in your host machine
  [Session: user=DLEE, cb={default} [ContentSource: user=DLEE, cb={none} [provider: address=home/192.168.1.10:8011, pool=0/64]]]
        at com.marklogic.xcc.impl.handlers.AbstractRequestController.runRequest(AbstractRequestController.java:99)
        at com.marklogic.xcc.impl.SessionImpl.insertContent(SessionImpl.java:204)
        at org.xmlsh.marklogic.put.load(put.java:180)
        at org.xmlsh.marklogic.put.load(put.java:171)
        at org.xmlsh.marklogic.put.run(put.java:99)
        at org.xmlsh.core.XCommand.run(XCommand.java:86)
        at org.xmlsh.core.XCommand.run(XCommand.java:63)
        at org.xmlsh.sh.core.SimpleCommand.exec(SimpleCommand.java:121)
        at org.xmlsh.sh.shell.Shell.exec(Shell.java:560)
        at org.xmlsh.sh.core.Pipeline.exec(Pipeline.java:124)
        at org.xmlsh.sh.shell.Shell.exec(Shell.java:560)
        at org.xmlsh.sh.shell.Shell.runScript(Shell.java:362)
        at org.xmlsh.core.ScriptCommand.run(ScriptCommand.java:75)
        at org.xmlsh.sh.core.SimpleCommand.exec(SimpleCommand.java:121)
        at org.xmlsh.sh.shell.Shell.exec(Shell.java:560)
        at org.xmlsh.sh.core.Pipeline.exec(Pipeline.java:124)
        at org.xmlsh.sh.shell.Shell.exec(Shell.java:560)
        at org.xmlsh.sh.shell.Shell.interactive(Shell.java:461)
        at org.xmlsh.commands.builtin.xmlsh.run(xmlsh.java:82)
        at org.xmlsh.core.BuiltinCommand.run(BuiltinCommand.java:54)
        at org.xmlsh.sh.shell.Shell.main(Shell.java:690)
Caused by: java.io.IOException: An established connection was aborted by the software in your host machine
        at sun.nio.ch.SocketDispatcher.write0(Native Method)
        at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:33)
        at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:104)
        at sun.nio.ch.IOUtil.write(IOUtil.java:60)
        at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:334)
        at com.marklogic.http.HttpChannel.writeBuffer(HttpChannel.java:373)
        at com.marklogic.http.HttpChannel.writeBody(HttpChannel.java:353)
        at com.marklogic.http.HttpChannel.flushRequest(HttpChannel.java:346)
        at com.marklogic.http.HttpChannel.write(HttpChannel.java:134)
        at com.marklogic.xcc.impl.handlers.ContentInsertController.writeChunkHeader(ContentInsertController.java:299)
        at com.marklogic.xcc.impl.handlers.ContentInsertController.issueRequest(ContentInsertController.java:210)
        at com.marklogic.xcc.impl.handlers.ContentInsertController.serverDialog(ContentInsertController.java:112)
        at com.marklogic.xcc.impl.handlers.AbstractRequestController.runRequest(AbstractRequestController.java:72)
        ... 20 more





From: general-boun...@developer.marklogic.com
[mailto:general-boun...@developer.marklogic.com] On Behalf Of Sam Neth
Sent: Saturday, March 13, 2010 6:08 PM
To: General Mark Logic Developer Discussion
Subject: Re: [MarkLogic Dev General] ServerConnectionException
-consistantly after about 20, 000 files

Could you post a stack trace?

What version of XCC are you using?

What specifically are you referring to when you talk about "opening
connections"?

On Mar 13, 2010, at 2:33 PM, Lee, David wrote:


If I use XCC to iteratively insert a large set of documents I
consistently get this error

com.marklogic.xcc.exceptions.ServerConnectionException: An established
connection was aborted by the software in your host machine [Session:
user=DLEE, cb={default} [ContentSource: user=DLEE, cb={none} [provider:
address=home/192.168.1.10:8011, pool=0/64]]]


This occurs after about 20,000 files and aborts the program.
I'm thinking of implementing an exception handler to retry, but I don't
want to be retrying after more serious errors.
The server log doesn't show any problems, and this is on a dedicated
1GB wired LAN, so I don't think it's internet problems.
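
One way such a selective retry could look, as a sketch only: retry a bounded number of times, and only when a caller-supplied test says the exception is worth retrying. BoundedRetry and its parameters are hypothetical names; java.io.IOException stands in here for the connection-level failure, and with XCC the test would presumably check for ServerConnectionException instead.

```java
import java.io.IOException;
import java.util.concurrent.Callable;
import java.util.function.Predicate;

/** Illustrative helper (hypothetical name): bounded retry for selected exception types. */
final class BoundedRetry {

    /**
     * Calls the work up to maxAttempts times. An exception is rethrown immediately
     * when the predicate rejects it or when the attempts are used up.
     */
    static <T> T call(Callable<T> work, int maxAttempts, long pauseMillis,
                      Predicate<Exception> retryable) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        for (int attempt = 1; ; attempt++) {
            try {
                return work.call();
            } catch (Exception e) {
                if (attempt >= maxAttempts || !retryable.test(e)) {
                    throw e; // serious error, or out of attempts: give up
                }
                Thread.sleep(pauseMillis); // brief pause before trying again
            }
        }
    }

    public static void main(String[] args) throws Exception {
        int[] calls = {0};
        // Stand-in for an insert that fails twice with a connection error, then succeeds.
        Callable<String> flaky = () -> {
            if (++calls[0] < 3) {
                throw new IOException("connection aborted");
            }
            return "ok after " + calls[0] + " attempts";
        };
        System.out.println(call(flaky, 5, 100L, e -> e instanceof IOException));
        // prints "ok after 3 attempts"
    }
}
```

Anything the predicate does not accept is rethrown on the first failure, so more serious errors still abort the run.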

If instead of using the same connection I open the connection for each
file, it often gets around this problem, but not always.
I think it's getting around it because I'm not aborting on error in
that case (just going to the next file).

I'm using this code snippet to create the content in batches of 1-20
(files in a directory):

Content content = ContentFactory.newContent(uri, file, mCreateOptions);
contents.add(content);
...

if (!contents.isEmpty())
      session.insertContent(contents.toArray(new Content[contents.size()]));



Any suggestions ?



----------------------------------------
David A. Lee
Senior Principal Software Engineer
Epocrates, Inc.
d...@epocrates.com
812-482-5224

_______________________________________________
General mailing list
General@developer.marklogic.com
http://xqzone.com/mailman/listinfo/general

---
Ron Hitchens {mailto:r...@ronsoft.com}   Ronsoft Technologies
      (650) 766-2355 (Home Office)       http://www.ronsoft.com
      (707) 924-3878 (fax)               Bit Twiddling At Its Finest
"No amount of belief establishes any fact." -Unknown