I wanted to take a closer look at one of the files reported as having less output:

govdocs1/477/477727.ppt         MBD0104A5C8.doc

So I ran tika-app 2.8.0 and the 2.8.1 snapshot and got the same output from both. But according to the Excel table, B should contain these additional tokens (TOP_10_MORE_IN_B column):

and: 3 | land: 3 | micro: 3 | reuse: 3 | sprinkler: 3 | water: 3 | capture: 2 | collect: 2 | considered: 2 | leveled: 2

I can't find "sprinkler" in the text below or in the Word file.

<?xml version="1.0" encoding="UTF-8"?><html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta name="cp:revision" content="3"/>
<meta name="meta:word-count" content="322"/>
<meta name="extended-properties:Application" content="Microsoft Word 10.0"/>
<meta name="meta:last-author" content="Carolyn.Jones"/>
<meta name="dc:creator" content="NWCC"/>
<meta name="extended-properties:Company" content="USDA"/>
<meta name="xmpTPg:NPages" content="1"/>
<meta name="resourceName" content="MBD0104A5C8.doc"/>
<meta name="dcterms:created" content="2005-01-06T20:39:00Z"/>
<meta name="dcterms:modified" content="2005-03-29T22:02:00Z"/>
<meta name="meta:character-count" content="1836"/>
<meta name="extended-properties:Template" content="Normal.dot"/>
<meta name="X-TIKA:Parsed-By" content="org.apache.tika.parser.DefaultParser"/> <meta name="X-TIKA:Parsed-By" content="org.apache.tika.parser.microsoft.OfficeParser"/>
<meta name="extended-properties:TotalTime" content="3600000000"/>
<meta name="Content-Length" content="22528"/>
<meta name="meta:page-count" content="1"/>
<meta name="Content-Type" content="application/msword"/>
<title/>
</head>
<body><p><b>Conservation Security Program (CSP)
</b></p>
<p><b>Irrigation Enhancement Index Tool</b></p>
<p>This tool is designed to help landowners conduct a self assessment of their eligibility for payment for enhanced irrigation systems in the Conservation Security Program.  It may also serve as a means of documenting irrigation system components that can be utilized during individual interviews.
</p>
<p>This procedure is to be utilized on irrigated lands eligible for CSP and will result in assigning an Irrigation Enhancement Index value to the irrigation system being evaluated.
</p>
<p>This procedure starts with a base value that is assigned to the specific type of irrigation system in use.  Systems that commonly have higher irrigation efficiencies and/or are easier to manage are assigned higher values.  Modifiers are applied based on the level of management and the efficiency of the on-farm water delivery system.  A bonus is given if runoff from the irrigated field is captured for re-use.</p> <p>The final calculation will require a value of the Soil Condition Index (SCI) multiplier.  The exact value of this multiplier will be provided to you when NRCS staff computes your final SCI during your interview.  The multiplier will be a value from 0.9 to 1.0 depending on your SCI.</p> <p>This self assessment is simple and should take less than 5 minutes to complete.  A basic hand calculator is recommended. In addition, basic knowledge of the irrigation system and management practices in use is necessary.  Definitions of the various terms are included in this tool.</p> <p>When the self assessment is complete, the landowner will have calculated an Irrigation Enhancement Index value for the irrigation system.  The Irrigation Enhancement Index is not an efficiency number, but rather an indicator of how well the system may perform.  If the Irrigation Enhancement Index value is 50 or more, the landowner may be eligible for CSP payments.  If the Irrigation Enhancement Index value is less than 50, the applicant should consider utilizing other USDA programs to improve the irrigation system.  If the Irrigation Enhancement Index is 60 or greater, the applicant may be eligible for increased payments.
</p>
</body></html>
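For anyone who wants to repeat the check, here is a rough sketch of how the output above could be searched for the reported tokens. It is only an approximation: the tag stripping and the letters-only tokenizer are my own simplifications and will not match the tokenization used to build the table, and the function names `token_counts` and `missing_tokens` are made up for illustration.

```python
import html
import re
from collections import Counter


def token_counts(xhtml):
    """Count lowercase word tokens in the text content of tika-app XHTML output.

    Tags are stripped with a regex, so metadata values (which live in
    attributes of <meta> elements) are not counted. Assumption: a plain
    letters-only tokenizer is close enough for a sanity check.
    """
    text = re.sub(r"<[^>]+>", " ", xhtml)   # drop tags and the XML declaration
    text = html.unescape(text).lower()      # decode entities such as &amp;
    return Counter(re.findall(r"[a-z]+", text))


def missing_tokens(xhtml, expected):
    """Return the expected tokens that never occur in the extracted text."""
    counts = token_counts(xhtml)
    return [tok for tok in expected if counts[tok] == 0]
```

Saving the XHTML to a file and calling `missing_tokens(open(path, encoding="utf-8").read(), ["sprinkler", "water", "land"])` would list whichever of those tokens are absent. Note that a hyphenated word such as "re-use" is split into two tokens by this sketch, so it would not count as "reuse".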
