Thanks Hoss,

The issue you mentioned describes behavior similar to what I observed, but not quite the same. Commons-fileupload creates java.io.File objects for the temp files, and when those Files are garbage collected, the temp file is deleted. I've verified this by letting the temp files build up and then forcing a full collection, which clears all of them. So I think the reason a percentage of temp files built up in my system was that, under heavy load, some of the java.io.Files made it into the old generation of the heap, where they are not collected until a full collection runs. I switched to G1, and the problem went away.
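To make the mechanism described above concrete, here is a minimal sketch of deleting a temp file once an associated marker object has been garbage collected. It uses only the JDK (PhantomReference and ReferenceQueue) and is not the commons-fileupload code; the names TempFileReaper, Tracker, track, reap and pending are hypothetical, and the "upload_" prefix is just an illustrative file name.

```java
import java.io.File;
import java.io.IOException;
import java.lang.ref.PhantomReference;
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Hypothetical illustration only; not the commons-fileupload implementation.
final class TempFileReaper {

    /** Phantom reference that remembers which file to delete once its marker is collected. */
    static final class Tracker extends PhantomReference<Object> {
        final File file;

        Tracker(File file, Object marker, ReferenceQueue<Object> queue) {
            super(marker, queue);
            this.file = file;
        }
    }

    private final ReferenceQueue<Object> queue = new ReferenceQueue<>();

    // Trackers must stay strongly reachable, otherwise they could be
    // collected themselves and would never be enqueued.
    private final Set<Tracker> trackers = Collections.synchronizedSet(new HashSet<>());

    /** Registers a file for deletion once the marker object becomes unreachable. */
    Tracker track(File file, Object marker) {
        Tracker tracker = new Tracker(file, marker, queue);
        trackers.add(tracker);
        return tracker;
    }

    /** Deletes files whose trackers have been enqueued; returns how many were deleted. */
    int reap() {
        int deleted = 0;
        Reference<?> ref;
        while ((ref = queue.poll()) != null) {
            Tracker tracker = (Tracker) ref;
            trackers.remove(tracker);
            if (tracker.file.delete()) {
                deleted++;
            }
        }
        return deleted;
    }

    /** Number of files still waiting for their marker to be collected. */
    int pending() {
        return trackers.size();
    }

    public static void main(String[] args) throws IOException, InterruptedException {
        TempFileReaper reaper = new TempFileReaper();
        File tmp = File.createTempFile("upload_", ".tmp");
        Object marker = new Object();
        reaper.track(tmp, marker);

        marker = null;   // drop the only strong reference to the marker
        System.gc();     // only a hint; the JVM may or may not collect right away
        Thread.sleep(200);

        System.out.println("deleted=" + reaper.reap() + " pending=" + reaper.pending());
        tmp.delete();    // tidy up in case the marker was not collected
    }
}
```

The point of the sketch is that the file is only removed after the collector has actually processed the marker. If the marker has been promoted to the old generation, a young-generation collection will not enqueue the tracker, and the temp file stays on disk until an old-generation or full collection runs, which would match the build-up I saw.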
Regarding how the XML files are being sent, I have verified that each XML file is sent as a single request, by aligning the access log of my Solr master server with the processing log of my SolrJ server. I didn't test the requests to see if the MIME type is multipart, but I suppose it is possible if some other form data or instruction needed to be passed with it. Either way, I suppose it would go through fileupload anyway, because somebody's got to make a temp file for large files, right?

Ryan
________________________________________
From: Chris Hostetter [hossman_luc...@fucit.org]
Sent: Wednesday, January 16, 2013 6:06 PM
To: solr-user@lucene.apache.org
Subject: RE: SolrJ DirectXmlRequest

: DirectXmlRequest is part of the SolrJ library, so I guess that means it
: is not commonly used. My use case is that I'm applying an XSLT to the
: raw XML on the client side, instead of leaving that up to the Solr
: master (although even if I applied the XSLT on the Solr server, I'd

I think Otis's point was that most people don't have Solr XML files lying around that they send to Solr, nor do they build up XML strings in Java in the Solr input format (with XSLT or otherwise) ... most people using SolrJ build up SolrInputDocument objects and pass those to their SolrServer instance.

: I've done some research and I'm fairly confident that apache
: commons-fileupload library is responsible for the temp files. There's

I believe you are correct ... searching for "solr fileupload temp files" led me to this issue which seems to have fallen by the wayside...

https://issues.apache.org/jira/browse/SOLR-1953

...if you could try that patch out and/or post your comments it would be helpful.

Something that seems really odd to me however is how/why your basic updates are even causing multipart/file-upload functionality to be used ...
a quick skim of the client code suggests that this should only happen if you try to send multiple ContentStreams in a single request: I can understand why that wouldn't typically happen for most users building up multiple SolrInputDocuments (they would get added to a single stream); and I can understand why that would typically happen for users sending multiple binary files to something like ExtractingRequestHandler -- but if you are using DirectXmlRequest in the way you described, each XML file should be sent as a single stream in a single request and the XML should be sent in the raw POST body -- the commons-fileupload code shouldn't even come into play.

(either that, or I'm missing something, or you're using an older version of Solr that used fileupload even if there was only a single content stream)

-Hoss