funnelwebcentral.org Forum Index funnelwebcentral.org
The Independent Funnel Web Users Forum
 

Crawlers and Bots

 
FWnewbie



Joined: 06 Jul 2006
Posts: 6

Posted: Thu Oct 26, 2006 2:01 pm    Post subject: Crawlers and Bots

I'm getting a lot of traffic from crawlers and bots (e.g. msnbot.msn.com, crawler100.ask.com). I need these excluded from my data, since I want to report on user traffic only. Does anyone use filters to exclude the bots, or is there another way to go about this?

Thanks
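
One way to approach this outside FWA is to strip crawler lines from the log before it is analyzed. The following is only a minimal sketch, assuming the bot name or crawler host appears somewhere in each raw log line; `BOT_MARKERS`, `is_bot_line` and `drop_bot_lines` are hypothetical names, not part of FWA.

```python
# Hypothetical marker list, seeded from the bots named in this thread.
# Extend it with whatever shows up in your own logs.
BOT_MARKERS = ("msnbot", "googlebot", "crawler")


def is_bot_line(line, markers=BOT_MARKERS):
    """Return True if the raw log line mentions any known bot marker."""
    lowered = line.lower()
    return any(marker in lowered for marker in markers)


def drop_bot_lines(lines, markers=BOT_MARKERS):
    """Yield only the lines that do not look like crawler traffic."""
    for line in lines:
        if not is_bot_line(line, markers):
            yield line
```

A plain substring match is crude: it will also drop a real visitor whose requested URL or referrer happens to contain one of the markers, so it is worth checking what gets removed before trusting the filtered numbers.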
dapease
Site Admin


Joined: 31 Jan 2005
Posts: 57

Posted: Fri Oct 27, 2006 9:50 am

Well, there was a thread about MSN bots specifically, "Robot additions not being recognised". But I have found that a particularly good place to start is making sure your FWASettings.txt file is current. Since development stopped several years ago, there are quite a few search engines and bots that FWA doesn't recognize.

Luckily you can get a really good version from Dan Stouts at Manufactured Environments.

Try that first and see how it does, then we can worry about adding more bots not in the list.

dapease
_________________
One person not willing to let a good product fade away.
FWnewbie



Joined: 06 Jul 2006
Posts: 6

Posted: Mon Nov 06, 2006 5:02 pm

After updating the FWASettings.txt file, FWA has picked up more robots, but it is still missing many. In the visitor report, msnbot, googlebot and a few others still dominate. I also find it odd that although FWA recognizes googlebot as a bot, it still lets it into the visitor report as "crawl-66-249-66-109.googlebot.com"; the same goes for msnbot.

Any suggestions?
dapease
Site Admin


Joined: 31 Jan 2005
Posts: 57

Posted: Mon Nov 06, 2006 8:41 pm

Would it be possible to see a sample of your log file? I am curious to know if those host names are being stored in the log file or if they are coming from your DNS lookup. It is possible that the names are not consistent with the referenced names in the settings file.

dapease
FWnewbie



Joined: 06 Jul 2006
Posts: 6

Posted: Wed Nov 08, 2006 3:59 pm

Below is a sample straight from my log files for an msnbot.

2006-09-10 15:26:30 192.168.192.52 GET /AM/Template.cfm Section=Practice_Resources&Template=/CM/ContentDisplay.cfm&ContentID=7101 80 - 207.46.98.55 msnbot/1.0+(+http://search.msn.com/msnbot.htm) 200 0 0


Thanks
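
To see what the sample line above actually contains, it can be split into named fields. This is a rough sketch: the field order in `FIELDS` is a guess from the shape of the line (it resembles a W3C-style extended log), and `parse_line` and `is_ip` are hypothetical helpers. If the log file has a `#Fields:` header, that header is the authority on the real order.

```python
import ipaddress

# Assumed field order, guessed from the sample line; not confirmed by the poster.
FIELDS = ("date", "time", "server_ip", "method", "uri_stem", "uri_query",
          "port", "username", "client", "user_agent", "status",
          "substatus", "win32_status")

SAMPLE = ("2006-09-10 15:26:30 192.168.192.52 GET /AM/Template.cfm "
          "Section=Practice_Resources&Template=/CM/ContentDisplay.cfm&ContentID=7101 "
          "80 - 207.46.98.55 msnbot/1.0+(+http://search.msn.com/msnbot.htm) 200 0 0")


def parse_line(line):
    """Split a space-separated log line into a dict keyed by FIELDS."""
    parts = line.split()
    if len(parts) != len(FIELDS):
        raise ValueError("expected %d fields, got %d" % (len(FIELDS), len(parts)))
    return dict(zip(FIELDS, parts))


def is_ip(text):
    """Return True if text is a numeric IP address rather than a host name."""
    try:
        ipaddress.ip_address(text)
        return True
    except ValueError:
        return False


entry = parse_line(SAMPLE)
print("client:", entry["client"], "numeric IP:", is_ip(entry["client"]))
print("user agent:", entry["user_agent"])
```

If the field guess is right, the client column in this line is a numeric address rather than a host name, which would suggest that names such as "crawl-66-249-66-109.googlebot.com" come from a DNS lookup and not from the log itself, while the bot is still identifiable from the user agent column.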
dapease
Site Admin


Joined: 31 Jan 2005
Posts: 57

Posted: Wed Nov 08, 2006 4:19 pm

Ah.

I think I can see why you may be getting odd reports when you run FWA.

What web server are you using to serve your content? The reason I ask is that your log file does not appear to be in Common Log Format, the format that FWA prefers. Presented with other formats, it tries its best (it does understand IIS5 and older, for example), but it doesn't always make the best choices, and I think that might be causing your frustration here.

Check this thread: "No referrals and systems/browsers but they ARE there".

Though that thread is not directly related to your issue, I go over the Common Log Format there, which may be of help to you. The way I see it, you either need to change the format in which your web server saves its logs, OR you need to define your log format manually when running FWA. Honestly, though, I have never had much luck with the latter.
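
If neither option works out, a third route would be to convert the existing logs before analysis. The sketch below rewrites a line shaped like the sample posted earlier into Apache's Combined Log Format (Common Log Format plus referrer and user agent). Several things here are assumptions: the field order in `FIELDS` is guessed from the sample line, the timestamps are treated as UTC, and the protocol, byte count and referrer are placeholders because the sample does not record them. `to_combined` is a hypothetical helper, and whether FWA reads the result cleanly is untested.

```python
# Assumed field order, guessed from the sample line; check the log's
# "#Fields:" header, if it has one, before relying on this.
FIELDS = ("date", "time", "server_ip", "method", "uri_stem", "uri_query",
          "port", "username", "client", "user_agent", "status",
          "substatus", "win32_status")

MONTHS = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")


def to_combined(line, tz="+0000", protocol="HTTP/1.1"):
    """Return the line in Combined Log Format, or None if it does not fit."""
    parts = line.split()
    if line.startswith("#") or len(parts) != len(FIELDS):
        return None  # header/comment line or unexpected layout
    f = dict(zip(FIELDS, parts))
    year, month, day = f["date"].split("-")
    stamp = "{}/{}/{}:{} {}".format(day, MONTHS[int(month) - 1], year,
                                    f["time"], tz)
    url = f["uri_stem"]
    if f["uri_query"] != "-":
        url += "?" + f["uri_query"]
    # Size and referrer are unknown, so both are written as "-".
    # The user agent is copied through unchanged.
    return '{} - {} [{}] "{} {} {}" {} - "-" "{}"'.format(
        f["client"], f["username"], stamp, f["method"], url, protocol,
        f["status"], f["user_agent"])
```

Lines that do not match the assumed layout come back as `None`, so they can be counted and inspected instead of being silently mangled.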

dapease
All times are GMT - 5 Hours