
TOPIC:

Regarding the club websites 7 years 6 months ago #58900

  • pratik411
  • Topic Author
  • New Member
  • Posts: 2
  • Thank you received: 0
Hi,

I was going through a few club websites and came across the robots.txt file, which search engines use to decide what to crawl and index on a website. Why do we have a Disallow for all files?

User-agent: *
Disallow: /

This blocks the entire website from being indexed by Google. Can I change it myself for the club website?

Thanks
The topic has been locked.

Regarding the club websites 7 years 6 months ago #58904

  • SteveTheTechie
  • FreeToastHost Developer
  • Posts: 13529
  • Thank you received: 3831
We also use Allow rules that permit indexing of specific folders that we deem important.

You cannot change that as it directly impacts overall system performance. We are not going to grant access to that.
Regards,

Steve James, DTM
FreeToastHost System Developer
Officer Emeritus, Mindful Communicators (Club 1966, District 52) A President's Distinguished Club for each of the last 10 years.


Regarding the club websites 7 years 6 months ago #58906

  • pratik411
  • Topic Author
  • New Member
  • Posts: 2
  • Thank you received: 0
Hi Sir,

Thanks for the reply.

Please find my robots.txt file below. This file won't facilitate indexing, as no Allow directive is specified.

Help appreciated.
User-agent: Googlebot
Crawl-delay: 10 
Disallow: /jquery/
Disallow: /fthadmin/
Disallow: /logfiles/
Disallow: /OLD_FILES/
Disallow: /text/
Disallow: /js/ckeditor_smileys/
Disallow: /js/
Disallow: /json/

User-agent: bingbot
Crawl-delay: 10 
Disallow: /jquery/
Disallow: /fthadmin/
Disallow: /logfiles/
Disallow: /OLD_FILES/
Disallow: /text/
Disallow: /js/
Disallow: /json/

User-agent: MSNBot
Crawl-delay: 10
Disallow: /jquery/
Disallow: /fthadmin/
Disallow: /logfiles/
Disallow: /OLD_FILES/
Disallow: /text/
Disallow: /js/
Disallow: /json/

User-agent: Slurp 
Crawl-delay: 10
Disallow: /jquery/
Disallow: /fthadmin/
Disallow: /logfiles/
Disallow: /OLD_FILES/
Disallow: /text/
Disallow: /js/
Disallow: /json/

User-agent: *
Disallow: /
Crawl-delay: 60

User-agent: magpie-crawler
Disallow: /

User-agent: rogerbot
Disallow: /

User-agent: AhrefsBot
Disallow: /

User-agent: stremorbot
Disallow: /

User-agent: YandexBot
Disallow: /

User-agent: Ezooms
Disallow: /

User-agent: omgilibot
Disallow: /

User-agent: SeznamBot
Disallow: /

User-agent: Baiduspider
Disallow: /

User-agent: Sosospider
Disallow: /

User-agent: Sosospider+
Disallow: /

User-agent: wonderbot/JS 1.0
Disallow: /

Regarding the club websites 7 years 6 months ago #58907

  • GeorgeMarshall
  • Visitor
Google is certainly indexing clubs on FTH.
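That matches how robots.txt groups are generally read: a crawler follows the most specific User-agent group that names it and falls back to the `User-agent: *` group only when no named group matches. Below is a minimal sketch using Python's standard `urllib.robotparser` with an abbreviated copy of the posted file. `ExampleBot` and the paths are made up for illustration, and real crawlers may apply their own matching rules.

```python
from urllib.robotparser import RobotFileParser

# Abbreviated copy of the posted file: one named group plus the catch-all group.
ROBOTS_TXT = """\
User-agent: Googlebot
Crawl-delay: 10
Disallow: /fthadmin/
Disallow: /js/

User-agent: *
Disallow: /
Crawl-delay: 60
"""


def build_parser(text: str) -> RobotFileParser:
    """Parse robots.txt text directly, without fetching anything over HTTP."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# "ExampleBot" is a hypothetical crawler that has no group of its own,
# and the paths are illustrative, not actual FTH URLs.
checks = [
    ("Googlebot", "/index.html"),
    ("Googlebot", "/js/site.js"),
    ("ExampleBot", "/index.html"),
]
for agent, path in checks:
    print(agent, path, parser.can_fetch(agent, path))
```

Under these assumptions, Googlebot is blocked only from the folders listed in its own group, while a crawler without a named group hits the blanket `Disallow: /`.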

Regarding the club websites 7 years 6 months ago #58908

  • SteveTheTechie
  • FreeToastHost Developer
  • Posts: 13529
  • Thank you received: 3831

pratik411 wrote: "Kindly find below my robots.txt file."


Ok, thanks, I will look into that when I get a chance this evening. We may have temporarily tweaked it a few months ago during the officer changeover when system performance was suffering.

However, keep in mind the following:

It is not your robots.txt file, per se. The idea that FTH websites are totally independent of one another, each with its own robots.txt file, is a bit of an illusion that the system promotes. The system is essentially one website (template) with many "personalities" (content), each of which results in "an individual club website".

We manage the robots.txt file(s) for the entire system to keep the *system* performance optimal.

Regarding the club websites 7 years 6 months ago #58909

  • SteveTheTechie
  • FreeToastHost Developer
  • Posts: 13529
  • Thank you received: 3831

GeorgeMarshall wrote: "Google is certainly indexing clubs on FTH."


Understood. However, as Brian will confirm, we have had instances where bad bots (not Googlebot) have brought the system performance "to its knees", so we tend to be very picky about which bots we allow and where we allow them to venture. In any case, some bots ignore robots.txt anyway.

We do want to enable SEO, but we tend to focus on the major search providers.

As an example, see the following article on Semalt, which *killed* the system performance a few years ago: www.incapsula.com/blog/semalt-botnet-spam.html

It is probably time for us to update our robots.txt file for current bad bots anyway, so this thread is probably a good thing. See www.botreports.com/badbots/

I know that Google now looks at more parts of websites (including JavaScript), so we may need to relax our restrictions a bit. However, I will be very wary of any resulting system performance impacts.

Regarding the club websites 7 years 6 months ago #58911

  • SteveTheTechie
  • FreeToastHost Developer
  • Posts: 13529
  • Thank you received: 3831
OK, I took out:
User-agent: *
Disallow: /
Crawl-delay: 60

Thanks for the heads-up! B) Given that the file's last-modified date was 7/6/2016, I think it was changed around the officer changeover time, as previously noted.
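
As a rough illustration of what removing that catch-all group changes, here is a sketch with Python's standard `urllib.robotparser`. The two strings are shortened stand-ins for the file before and after the edit, and `ExampleBot` is a hypothetical crawler with no group of its own. Actual crawler behavior may differ.

```python
from urllib.robotparser import RobotFileParser

# Shortened stand-in for the file before the edit (catch-all group present).
BEFORE = """\
User-agent: bingbot
Crawl-delay: 10
Disallow: /js/

User-agent: *
Disallow: /
Crawl-delay: 60

User-agent: AhrefsBot
Disallow: /
"""

# The same stand-in with the three catch-all lines taken out.
AFTER = """\
User-agent: bingbot
Crawl-delay: 10
Disallow: /js/

User-agent: AhrefsBot
Disallow: /
"""


def allowed(robots_txt: str, agent: str, path: str = "/") -> bool:
    """Return whether `agent` may fetch `path` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, path)


# "ExampleBot" is hypothetical; bingbot and AhrefsBot have their own groups.
for agent in ("bingbot", "AhrefsBot", "ExampleBot"):
    print(agent, "before:", allowed(BEFORE, agent), "after:", allowed(AFTER, agent))
```

With no `User-agent: *` group left, a crawler that is not named anywhere has no rules applied to it, so unwanted bots have to be listed individually, as the remaining entries already do.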

There is still room for some discussion about whether we should be disallowing the js folders or not. I know that Google wants to index JavaScript... I am just a bit uncertain about the potential system performance impact. It is a single-server system running a MySQL database for 11,000+ clubs/districts. We do a number of things to keep it running smoothly, but it does not take much to max it out.

Regarding the club websites 7 years 6 months ago #58912

  • GeorgeMarshall
  • Visitor
Is there really any reason why bots should be allowed to index JS? Is there any club-specific public data there?

Regarding the club websites 7 years 6 months ago #58913

  • SteveTheTechie
  • FreeToastHost Developer
  • Posts: 13529
  • Thank you received: 3831

GeorgeMarshall wrote: "Is there really any reason why bots should be allowed to index JS? Is there any club-specific public data there?"


No.

The following info looks relevant and interesting though:
searchengineland.com/tested-googlebot-cr...heres-learned-220157

The key thing is that the website template is filled server-side, not via JavaScript. We could do more template filling via AJAX calls in the future; we do a little of that in a few places now, but not much.

However, there are some JavaScript redirects, as mentioned in the article.

How much of this is important? Not sure.