Re: [dspace-tech] server flooded by search requests

2017-10-25 Thread Francis Brouns
Dear Monika,

We have had sitemaps enabled for quite some time and have not made any 
changes to the server recently. A robots.txt is also present.
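
With a robots.txt in place, one thing worth checking is whether any of its Disallow rules actually match the handle-scoped search URLs seen in the access log (/handle/1820/2145/simple-search). The sketch below assumes Google-style matching (prefix match, `*` as a wildcard, a trailing `$` as an end anchor). The three rules are hypothetical examples, not taken from this server's robots.txt, and `rule_matches` is an illustrative helper, not part of DSpace.

```python
import re

def rule_matches(pattern, path):
    """Google-style robots.txt matching: prefix match, '*' wildcard, trailing '$' anchor."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and let each '*' match any run of characters.
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    if anchored:
        regex += "$"
    # re.match anchors at the start of the path, which gives the prefix semantics.
    return re.match(regex, path) is not None

path = "/handle/1820/2145/simple-search"
# Hypothetical Disallow values to compare against the logged path.
for rule in ("/simple-search", "/*/simple-search", "/handle/*/simple-search"):
    print(rule, rule_matches(rule, path))
```

Under these assumptions, a plain `/simple-search` rule does not cover the handle-scoped URL, while either wildcard form does.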

The problem is that these requests target the same community over and 
over again and differ only slightly from one another. They arrive 
15-20 times per second.

kind regards,
Francis Brouns



Re: [dspace-tech] server flooded by search requests

2017-10-25 Thread Monika Mevenkamp
Francis,

66.249.76.34 is a Googlebot IP address.
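
One way to confirm that an address really belongs to Googlebot is a reverse DNS lookup followed by a forward lookup of the returned host name. The sketch below assumes the host name should end in googlebot.com or google.com. The function name and the injectable lookup parameters are illustrative; the parameters exist only so the logic can be exercised without network access.

```python
import socket

def is_verified_googlebot(ip, reverse_lookup=None, forward_lookup=None):
    """Reverse-then-forward DNS check for an address claiming to be Googlebot."""
    # Defaults use real DNS; pass your own callables to test offline.
    reverse_lookup = reverse_lookup or (lambda addr: socket.gethostbyaddr(addr)[0])
    forward_lookup = forward_lookup or (lambda host: socket.gethostbyname_ex(host)[2])
    try:
        host = reverse_lookup(ip)
    except OSError:  # socket.herror and socket.gaierror are OSError subclasses
        return False
    # Assumed domain suffixes for Google's crawlers.
    if not host.rstrip(".").lower().endswith((".googlebot.com", ".google.com")):
        return False
    try:
        # The forward lookup must lead back to the original address.
        return ip in forward_lookup(host)
    except OSError:
        return False
```

Calling `is_verified_googlebot("66.249.76.34")` performs live DNS queries, so the result depends on the resolver available on the machine running it.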

You should enable sitemaps; Googlebot honors them.

Have a look at the Search Engine Optimization documentation.

Monika

Monika Mevenkamp
mo.me...@gmail.com

http://mo-meven.tumblr.com/
http://mcmprogramming.com/mo.meven/




[dspace-tech] server flooded by search requests

2017-10-25 Thread Francis Brouns
Hi all,

Our DSpace server has been flooded with search requests since the beginning 
of October. Normally we get about 25 search requests a month; now we have 
had 2.5 million in 2 weeks. The requests all seem to be aimed at one 
particular Community and search for combinations of authors and subjects 
over and over. Most of the time these searches return no results.
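
For scale, the figures above work out to an average of roughly two requests per second, sustained around the clock. A quick check, assuming "2 weeks" means 14 full days (`average_rate` is just an illustrative helper):

```python
SECONDS_PER_DAY = 24 * 60 * 60

def average_rate(requests, days):
    """Average requests per second over a period of whole days."""
    return requests / (days * SECONDS_PER_DAY)

# 2.5 million requests over an assumed 14 days.
print(f"{average_rate(2_500_000, 14):.2f} requests/second on average")
```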

We are running DSpace 5.4 on SLES Linux with Tomcat 7, Java 7, and JSPUI.

In the DSpace log I find numerous entries like this one:
ip_addr=66.249.76.34:search:scope=org.dspace.content.Community@287,query="null",results=(0,0,0)
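
To see which clients and scopes account for the volume, entries of this shape can be tallied with a short script. This is a sketch only: the regular expression covers just the fragment quoted above (the rest of the real log line is not assumed), and `tally_searches` is a hypothetical helper.

```python
import re
from collections import Counter

# Matches only the fragment shown above: ip_addr=...:search:scope=...,query="..."
SEARCH_RE = re.compile(
    r'ip_addr=(?P<ip>[0-9a-fA-F.:]+?):search:scope=(?P<scope>[^,]+),query="(?P<query>[^"]*)"'
)

def tally_searches(lines):
    """Count search log entries per (client IP, search scope) pair."""
    counts = Counter()
    for line in lines:
        match = SEARCH_RE.search(line)
        if match:
            counts[(match.group("ip"), match.group("scope"))] += 1
    return counts

sample = [
    'ip_addr=66.249.76.34:search:scope=org.dspace.content.Community@287,query="null",results=(0,0,0)',
]
for (ip, scope), n in tally_searches(sample).most_common():
    print(ip, scope, n)
```

For a real log, pass an open file object instead of the sample list, since the function only needs an iterable of lines.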

In the Tomcat localhost-access log:
66.249.76.34  - - [24/Oct/2017:01:05:30 +0200] "GET 
/handle/1820/2145/simple-search?location=1820%2F2145=_field_1=dateIssued_type_1=equals_value_1=2011_field_2=author_type_2=equals_value_2=Van+Hooft%2C+W.+F._field_3=author_type_3=equals_value_3=Leirs%2C+H._field_4=author_type_4=equals_value_4=Bauer%2C+H._field_5=subject_type_5=equals_value_5=phylogeography_field_6=author_type_6=equals_value_6=Van+Haeringen%2C+W.+A._field_7=author_type_7=equals_value_7=Bertola%2C+L.+D._field_8=subject_type_8=equals_value_8=evolutionary+history_field_9=author_type_9=equals_value_9=Tumenta%2C+P.+N._field_10=author_type_10=equals_value_10=York%2C+D.+S._field_11=subject_type_11=equals_value_11=Panthera+leo=5_by=dc.title_sort=DESC=0
 
HTTP/1.1" 200 30123 - /handle/1820/2145/simple-search
- 127.0.0.1 - - [24/Oct/2017:01:05:30 +0200] "GET 
/solr/search/select?q=*%3A*=dateIssued.year%2Chandle%2Csearch.resourcetype%2Csearch.resourceid=NOT%28withdrawn%3Atrue%29=NOT%28discoverable%3Afalse%29=subject_keyword%3ASubsidiarity=subject_keyword%3ACollaborative%5C+Learning=subject_keyword%3AVirtual%5C+Campus=dateIssued_keyword%3A2009=subject_keyword%3AOrganizational%5C+Model=subject_keyword%3ALearning%5C+for%5C+Sustainable%5C+Development=subject_keyword%3AVirtual%5C+Mobility=subject_keyword%3ANetworked%5C+Learning=location%3Am18=dateIssued.year%3A%5B*+TO+*%5D=read%3A%28g0+OR+g0%29=0=1=dateIssued.year_sort+asc=javabin=2
 
HTTP/1.1" 200 611 - /solr/search/select
- 127.0.0.1 - - [24/Oct/2017:01:05:30 +0200] "GET 
/solr/search/select?q=*%3A*=dateIssued.year%2Chandle%2Csearch.resourcetype%2Csearch.resourceid=NOT%28withdrawn%3Atrue%29=NOT%28discoverable%3Afalse%29=subject_keyword%3ASubsidiarity=subject_keyword%3ACollaborative%5C+Learning=subject_keyword%3AVirtual%5C+Campus=dateIssued_keyword%3A2009=subject_keyword%3AOrganizational%5C+Model=subject_keyword%3ALearning%5C+for%5C+Sustainable%5C+Development=subject_keyword%3AVirtual%5C+Mobility=subject_keyword%3ANetworked%5C+Learning=location%3Am18=location%3Am18=dateIssued.year%3A%5B*+TO+*%5D=read%3A%28g0+OR+g0%29=0=1=dateIssued.year_sort+desc=javabin=2
 
HTTP/1.1" 200 625 - /solr/search/select
- 127.0.0.1 - - [24/Oct/2017:01:05:30 +0200] "GET 
/solr/search/select?q=*%3A*=dateIssued.year%2Chandle%2Csearch.resourcetype%2Csearch.resourceid=NOT%28withdrawn%3Atrue%29=NOT%28discoverable%3Afalse%29=dateIssued_keyword%3A2011=subject_keyword%3Alion=author_keyword%3ASogbohossou%2C%5C+E.=author_keyword%3AVan%5C+Haeringen%2C%5C+W.%5C+A.=author_keyword%3AVan%5C+Hooft%2C%5C+W.%5C+F.=subject_keyword%3AWest%5C+Africa=subject_keyword%3Aphylogenetics=author_keyword%3APrins%2C%5C+H.%5C+H.%5C+T.=author_keyword%3AYork%2C%5C+D.%5C+S.=author_keyword%3AUit%5C+de%5C+Weerd%2C%5C+D.%5C+R.=author_keyword%3AFunston%2C%5C+P.%5C+J.=subject_keyword%3Aevolutionary%5C+history=author_keyword%3AUdo%5C+de%5C+Haes%2C%5C+H.%5C+A.=location%3Am18=dateIssued.year%3A%5B*+TO+*%5D=read%3A%28g0+OR+g0%29=0=1=dateIssued.year_sort+asc=javabin=2
 
HTTP/1.1" 200 740 - /solr/search/select
- 127.0.0.1 - - [24/Oct/2017:01:05:30 +0200] "GET 
/solr/search/select?q=*%3A*=dateIssued.year%2Chandle%2Csearch.resourcetype%2Csearch.resourceid=NOT%28withdrawn%3Atrue%29=NOT%28discoverable%3Afalse%29=dateIssued_keyword%3A2011=subject_keyword%3Alion=author_keyword%3ASogbohossou%2C%5C+E.=author_keyword%3AVan%5C+Haeringen%2C%5C+W.%5C+A.=author_keyword%3AVan%5C+Hooft%2C%5C+W.%5C+F.=subject_keyword%3AWest%5C+Africa=subject_keyword%3Aphylogenetics=author_keyword%3APrins%2C%5C+H.%5C+H.%5C+T.=author_keyword%3AYork%2C%5C+D.%5C+S.=author_keyword%3AUit%5C+de%5C+Weerd%2C%5C+D.%5C+R.=author_keyword%3AFunston%2C%5C+P.%5C+J.=subject_keyword%3Aevolutionary%5C+history=author_keyword%3AUdo%5C+de%5C+Haes%2C%5C+H.%5C+A.=location%3Am18=location%3Am18=dateIssued.year%3A%5B*+TO+*%5D=read%3A%28g0+OR+g0%29=0=1=dateIssued.year_sort+desc=javabin=2
 
HTTP/1.1" 200 758 - /solr/search/select
- 127.0.0.1 - - [24/Oct/2017:01:05:30 +0200] "GET