This article covers the options for setting up Web Snapshots in your archive.
Article Navigation
- Checking Your Website Sitemap Compatibility
- Adding Your Website Sitemap for Capture
- Adding Specific URLs
- Dynamic Option
- Turning Off Specific URLs
- Article Glossary
Checking Your Website Sitemap Compatibility
- Web Snapshots uses .XML sitemaps to capture your website. If you cannot find a sitemap for your website, please check with your web host or Information Technology (IT) team.
- Please review our Web Snapshots Overview document for more information.
- You must whitelist the three IP addresses we use (all AWS) for Web Snapshots:
  - 52.23.29.34
  - 54.235.88.205
  - 54.84.64.101
- Should your IT team prefer to whitelist a user agent, they can create a filter that allows any user agent that includes the text “ASWebsnapshotsUserAgent”. The other information in our user agent is dynamic and will change as we upgrade the browser we use to capture websites.
- The sitemap and URL list must match your website’s domain.
- If you use Google Analytics, you must filter out our Web Snapshots traffic.
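For IT teams writing their own filter, the IP and user agent checks above can be sketched in Python. This is a minimal illustration, not ArchiveSocial code: the function name is_web_snapshots_request, and the choice to accept a request when either the IP address or the user agent text matches, are assumptions. Only the three IP addresses and the “ASWebsnapshotsUserAgent” text come from this article.

```python
import ipaddress

# The three IP addresses and the user agent text listed in this article.
WEB_SNAPSHOTS_IPS = {
    ipaddress.ip_address(ip)
    for ip in ("52.23.29.34", "54.235.88.205", "54.84.64.101")
}
USER_AGENT_TEXT = "ASWebsnapshotsUserAgent"


def is_web_snapshots_request(remote_ip: str, user_agent: str) -> bool:
    """Hypothetical helper: True when a request looks like Web Snapshots traffic.

    Assumption: a match on EITHER the IP address or the user agent text is enough.
    """
    try:
        ip_matches = ipaddress.ip_address(remote_ip) in WEB_SNAPSHOTS_IPS
    except ValueError:
        # The remote address was not a valid IP string.
        ip_matches = False
    # Only the fixed text is checked; the rest of the user agent changes over time.
    agent_matches = USER_AGENT_TEXT in (user_agent or "")
    return ip_matches or agent_matches
```

Matching on the fixed text rather than the whole user agent string follows the point above that the rest of the user agent changes as the capture browser is upgraded.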
Adding Your Website Sitemap for Capture
- Log into your archive
- Navigate to the Configure tab
- On the left, choose Web Snapshots
- If this is your first time setting up Web Snapshots, you will need to choose an Account Owner
- When the page reloads, click the Add Sitemap button
- Fill out the Add Sitemap Menu
Note: If you do not have a sitemap.xml or sitemap_index.xml defined for your website, you may provide the URL for an HTML page that contains links to all the URLs that you want to snapshot. This will allow Web Snapshots to automatically detect new and changed URLs as you keep your sitemap or the sitemap index up to date.
- Sitemap Name: ArchiveSocial recommends the following naming convention: City of {name} Gov. Site – XML. This will allow the agency to easily identify what sites are connected and the type of sitemap being used.
- Sitemap Format: Choose the format of the sitemap you are entering (we suggest XML)
- Sitemap URL: The full URL of the sitemap you are adding. For example: http://example.org/path/sitemap.xml
- Click Save Sitemap
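Before saving a sitemap, it can help to confirm that every URL in it matches your website’s domain, as the compatibility checklist requires. The Python sketch below shows one way to do that for an XML sitemap. The function name urls_outside_domain is hypothetical, and treating “match” as an exact, case-insensitive hostname comparison is an assumption; Web Snapshots may apply a different rule (for example, for www and non-www hosts).

```python
from urllib.parse import urlparse
import xml.etree.ElementTree as ET

# Namespace used by the standard sitemap XML format.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def urls_outside_domain(sitemap_xml: str, site_domain: str) -> list[str]:
    """Hypothetical helper: list the sitemap URLs whose host differs from site_domain."""
    root = ET.fromstring(sitemap_xml)
    mismatched = []
    for loc in root.iter(f"{SITEMAP_NS}loc"):
        url = (loc.text or "").strip()
        host = urlparse(url).hostname or ""
        # Assumption: "match" means the hostnames are equal, ignoring case.
        if host.lower() != site_domain.lower():
            mismatched.append(url)
    return mismatched


sample = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://example.org/path/page.html</loc></url>
  <url><loc>http://other.example.net/page.html</loc></url>
</urlset>"""

print(urls_outside_domain(sample, "example.org"))
# prints ['http://other.example.net/page.html']
```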
Adding Specific URLs
- Navigate to the Configure tab
- Select the Web Snapshots tab
- Click Add Site URL
- Enter the full URL of the page
- Click Save Site URL
Dynamic Option
Dynamic content is web content that changes based on the behavior, preferences, and interests of a site visitor. This content can be found on websites and in email, and it is generated when a user accesses a page. It is often personalized: what is displayed depends on the data a site has for a user and the time of access. The primary use of this content is to deliver a more positive experience for the end user.
By default, the Dynamic option for Web Snapshots is turned off. Web Snapshots will detect changes to a page on a site using the XML sitemap and record the change in the archive once a day.
However, if Dynamic is enabled for a sitemap or URL, any widgets on the site or page (such as weather updates, a blog on the website, or a calendar of events) will be captured once daily on every page where they appear. For example, if your website has 100 pages and 12 of these pages have a dynamic widget, Web Snapshots will capture those 12 pages once per day as the widgets update.
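The 100-page example can be worked through in a short Python sketch. This is only a model of the behavior described above, not how Web Snapshots is implemented: the function pages_captured_today and its parameters are hypothetical, and it assumes a daily capture covers pages whose change was detected plus, when Dynamic is enabled, every page with a dynamic widget.

```python
def pages_captured_today(all_pages, changed_pages, widget_pages, dynamic_enabled):
    """Hypothetical model: return the set of pages captured in one daily run."""
    # Pages with a detected change are captured whether or not Dynamic is on.
    to_capture = set(changed_pages)
    if dynamic_enabled:
        # With Dynamic on, pages that contain widgets are captured every day.
        to_capture |= set(widget_pages)
    # Only pages that belong to the site are counted.
    return to_capture & set(all_pages)


pages = [f"/page-{i}" for i in range(1, 101)]  # a site with 100 pages
widget_pages = pages[:12]                      # 12 of them have a dynamic widget

# A day on which no page was otherwise detected as changed:
print(len(pages_captured_today(pages, [], widget_pages, dynamic_enabled=True)))   # prints 12
print(len(pages_captured_today(pages, [], widget_pages, dynamic_enabled=False)))  # prints 0
```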
Turning Off Specific URLs
- Navigate to the Configure tab
- Select the Web Snapshots tab
- For the URL you wish to turn off, click the gear icon under the Action section
- Toggle the Archiving switch to OFF
- Click the Save button
Article Glossary
- .XML: Extensible Markup Language
- AWS: Amazon Web Services
- XML Sitemap: An XML sitemap is a file that lists a website's essential pages, making sure Google can find and crawl them all. It also helps search engines understand your website structure.
- Whitelist: A whitelist is a list or register of entities that are being provided a particular privilege, service, mobility, access, or recognition. Entities on the list will be accepted, approved, and/or recognized.