A pattern-matched Twitter analysis of US cancer-patient sentiments

  • W. Christian Crannell
    Footnotes
    1 Present address: Oregon Health and Science University; 3181 SW Sam Jackson Park RD; Portland, OR 97239.
    Affiliations
    Department of Surgery, University of Vermont College of Medicine, Burlington, Vermont
  • Eric Clark
    Affiliations
    Department of Surgery, University of Vermont College of Medicine, Burlington, Vermont

    Department of Mathematics and Statistics, University of Vermont, College of Engineering and Mathematical Sciences, Burlington, Vermont
  • Chris Jones
    Affiliations
    Department of Surgery, University of Vermont College of Medicine, Burlington, Vermont
  • Ted A. James
    Affiliations
    Department of Surgery, University of Vermont College of Medicine, Burlington, Vermont
  • Jesse Moore
    Correspondence
    Corresponding author. Department of Surgery, University of Vermont, UVM Medical Center, 89 Beaumont Ave, Given Building, Burlington, VT 05405. Tel.: +1 802 656 3158; fax: +1 802 656 5886.
    Affiliations
    Department of Surgery, University of Vermont College of Medicine, Burlington, Vermont

      Abstract

      Background

      Twitter has been recognized as an important source of organic sentiment and opinion. This study aimed to (1) characterize the content of tweets authored by United States cancer patients; and (2) use patient tweets to compute the average happiness of cancer patients for each cancer diagnosis.

      Methods

      A large sample of English tweets from March 2014 through December 2014 was obtained from Twitter. Using regular expression software pattern matching, the tweets were filtered by cancer diagnosis. For each cancer-specific tweetset, individual patients were extracted, and the content of the tweet was categorized. The patients' Twitter identification numbers were used to gather all tweets for each patient, and happiness values for patient tweets were calculated using a quantitative hedonometric analysis.

      Results

      The most frequently tweeted cancers were breast (n = 15,421, 11% of total cancer tweets), lung (n = 2928, 2.0%), prostate (n = 1036, 0.7%), and colorectal (n = 773, 0.5%). Patient tweets pertained to the treatment course (n = 73, 26%), diagnosis (n = 65, 23%), and then surgery and/or biopsy (n = 42, 15%). Computed happiness values for each cancer diagnosis revealed higher average happiness values for thyroid (h_avg = 6.1625), breast (h_avg = 6.1485), and lymphoma (h_avg = 6.0977) cancers and lower average happiness values for pancreatic (h_avg = 5.8766), lung (h_avg = 5.8733), and kidney (h_avg = 5.8464) cancers.

      Conclusions

      The study confirms that patients are expressing themselves openly on social media about their illness and that unique cancer diagnoses are correlated with varying degrees of happiness. Twitter can be employed as a tool to identify patient needs and as a means to gauge the cancer patient experience.

      Introduction

      Twitter (www.twitter.com) is a well-known online microblogging social media platform that currently has 320 million monthly active members. The service allows users to send short messages called “tweets” that are limited to 140 characters; approximately 500 million tweets are sent per day.

      About. Twitter. [Online] 2015. [Cited: March 2, 2015.] Available at: https://about.twitter.com/company.

      The Pew Research Center, which tracks social media usage among United States adult internet users, reported that Twitter usage increased significantly from 18% to 23% over the past year, with a significant increase of 5%-10% in users older than 65 y.
      • Duggan M.
      • Ellison N.B.
      • Lampe C.
      • Lenhart A.
      • Madden M.
      Social media Update 2014.
      Indeed, there has been wide recognition that Twitter is a powerful gauge of public sentiment across a spectrum of current social and medical issues: the impact of socioeconomic factors on happiness,
      • Mitchell L.
      • Frank M.R.
      • Harris K.D.
      • Dodds P.S.
      • Danforth C.M.
      The geography of happiness: connecting twitter sentiment and expression, demographics, and objective characteristics of place.
      climate policies,
      • Cody E.M.
      • Reagan A.J.
      • Mitchell L.
      • Dodds P.S.
      • Danforth C.M.
      Climate change sentiment on twitter: an unsolicited public opinion poll.
      election results,
      • Caldarelli G.
      • Chessa A.
      • Pammolli F.
      • et al.
      A multi-level geographical study of Italian political elections from Twitter data.
      • Borondo J.
      • Morales A.J.
      • Losada J.C.
      • Benito R.M.
      Characterizing and modeling an electoral campaign in the context of Twitter: 2011 Spanish Presidential election as a case study.
      opioid abuse,
      • Chan B.
      • Lopez A.
      • Sarkar U.
      The canary in the coal mine tweets: social media reveals public perceptions of non-medical use of opioids.
      understanding public perception of immunizations,
      • Love B.
      • Himelboim I.
      • Holton A.
      • Stewart K.
      Twitter as a source of vaccination information: content drivers and what they are saying.
      predicting enrollment in Affordable Care Act marketplaces,
      • Wong C.A.
      • Sap M.
      • Schwartz A.
      • et al.
      Twitter sentiment predicts Affordable Care Act marketplace enrollment.
      perceptions of e-cigarettes,
      • Cole-Lewis H.
      • Pugatch J.
      • Sanders A.
      • et al.
      Social listening: a content analysis of e-cigarette discussions on twitter.
      trending infectious disease,
      • Lampos V.
      • Cristianini N.
      • Signorini A.
      • Segre A.M.
      • Polgreen P.M.
      The use of Twitter to track levels of disease activity and public concern in the US during the influenza A H1N1 pandemic.
      • Goff D.A.
      • Kullar R.
      • Newland J.G.
      Review of twitter for infectious diseases clinicians: useful or a waste of time?.
      obesity, and allergies.
      • Michael J.P.
      • Dredze M.
      You are what you Tweet: Analyzing Twitter for public health.
      Diez et al.
      • De la Torre-Díez I.
      • Díaz-Pernas F.J.
      • Antón-Rodríguez M.
      A content analysis of chronic diseases social groups on Facebook and Twitter.
      sought to qualitatively characterize the content of breast cancer, colorectal cancer, and diabetes social groups on Facebook and Twitter. Twitter has also been used in the cancer arena to better understand breast cancer awareness month
      • Rosemary T.
      • Burton S.
      • Giraud-Carrier C.
      • Rollins S.
      • Draper C.
      Using Twitter for breast cancer prevention: an analysis of breast cancer awareness month.
      and to qualitatively categorize cervical and breast cancer screening patient dialog.
      • Lyles C.R.
      • López A.
      • Pasick R.
      • Sarkar U.
      “5 mins of uncomfyness is better than dealing with cancer 4 a lifetime”: an exploratory qualitative analysis of cervical and breast cancer screening dialogue on Twitter.
      Finally, researchers have begun to understand the interconnectedness of cancer patients on Twitter and have sought to characterize those relationships.
      • Sugawara Y.
      • Narimatsu H.
      • Hozawa A.
      • Shao L.
      • Otani K.
      • Fukao A.
      Cancer patients on Twitter: a novel patient community on social media.
      As patients increasingly turn to social media to express themselves about health care concerns, we sought to test the twittersphere as a potential means by which to collect and describe the content of patient tweets and to analyze patients' health sentiments with respect to the leading cancer diagnoses as documented by the National Cancer Institute.
      American Cancer Society
      Cancer facts & figures 2015.
      We hypothesized that the most prevalent cancers would be the most frequently tweeted and that patient happiness values would vary for each cancer diagnosis.

      Methods

      A large sample of English tweets from March 2014 to December 2014 with embedded location coordinates (“geotagged”) was obtained from Twitter's streaming application programming interface. Pattern matching using “cancer” as a keyword returned 186,406 tweets. Relevant cancer-related tweets were filtered from the data stream using regular expression software (Perl) with case-insensitive pattern matching, along with tokenization algorithms to strip punctuation. Tweets from countries other than the United States, spam, and other unrelated content were excluded by filtering the data set. A total of 25,103 tweets were removed from the data set, resulting in a clean data set of 146,357 tweets from the United States. From the cleaned tweetset, regular expression software was used to extract tweets using terms relevant to specific cancer diagnoses (see Table 1) for the highest incidence cancers within the United States (breast, prostate, lung, colon and rectal, melanoma, bladder, non-Hodgkin lymphoma, kidney, thyroid, endometrial, leukemia, and pancreatic).
      American Cancer Society
      Cancer facts & figures 2015.
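The diagnosis-specific filtering step can be illustrated with a minimal sketch. The study used Perl; Python's `re` module is swapped in here purely for illustration. The filter terms are a subset of those listed in Table 1, while the function names (`normalize`, `classify`) and the exact punctuation handling are assumptions, not the authors' implementation.

```python
import re
import string

# Filter terms taken from Table 1 (subset shown for brevity).
FILTER_TERMS = {
    "breast": ["breast cancer"],
    "colorectal": ["colon cancer", "colorectal", "rectal cancer"],
    "pancreas": ["pancreas cancer", "pancreatic cancer"],
    "kidney": ["clear cell", "kidney cancer", "renal cell carcinoma", "renal cancer"],
}

# One case-insensitive pattern per diagnosis; \b anchors matches on word boundaries.
PATTERNS = {
    dx: re.compile(
        r"\b(?:" + "|".join(re.escape(t) for t in terms) + r")\b",
        re.IGNORECASE,
    )
    for dx, terms in FILTER_TERMS.items()
}

# Map every ASCII punctuation character to a space (hypothetical tokenization rule).
_PUNCT_TO_SPACE = str.maketrans(string.punctuation, " " * len(string.punctuation))


def normalize(text):
    """Replace punctuation with spaces and collapse runs of whitespace."""
    return " ".join(text.translate(_PUNCT_TO_SPACE).split())


def classify(tweet_text):
    """Return the diagnoses whose filter terms appear in the tweet text."""
    clean = normalize(tweet_text)
    return [dx for dx, pattern in PATTERNS.items() if pattern.search(clean)]
```

For example, `classify("Dad has Pancreatic-Cancer")` returns `["pancreas"]`, because the hyphen is replaced by a space before matching. A hashtag such as `#BreastCancer` would not match under this rule, since the two words are not separated; whether the original pipeline handled such cases is not stated in the text.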
      Within each cancer-specific tweetset, individual patients with active and remote disease were extracted manually, resulting in a cancer-patient tweetset. The contents of the patients' tweets for each cancer diagnosis were qualitatively categorized. For all manually extracted cancer patients, the patients' Twitter identification (ID) numbers were used to obtain all tweets for each ID number (thus including tweets not necessarily pertaining to cancer); this was done for each unique cancer diagnosis for the same March-December 2014 time frame. To investigate the relative sentiment (happiness) of cancer patient tweets, a quantitative hedonometric analysis using the Language Assessment by Mechanical Turk (LabMT) word list
      • Dodds P.S.
      • Harris K.D.
      • Kloumann I.M.
      • Bliss C.A.
      • Danforth C.M.
      Temporal patterns of happiness and information in a global social network: hedonometrics and Twitter.
      was performed on the patient tweetsets for each cancer diagnosis. LabMT is a word happiness list of the most frequently occurring 10,222 English words compiled through frequency distributions from Google Books, the New York Times (1987-2007), music lyrics (1960-2007), and Twitter. To estimate the numerical average happiness (h_avg) of each word, words were scored on a 1-9 “happiness scale” using the popular online survey service Amazon Mechanical Turk. The happiest word is “laughter” (h_avg = 8.50), and the saddest word is “terrorist” (h_avg = 1.30).
      • Dodds P.S.
      • Harris K.D.
      • Kloumann I.M.
      • Bliss C.A.
      • Danforth C.M.
      Temporal patterns of happiness and information in a global social network: hedonometrics and Twitter.
      The hedonometric analysis computes the average happiness value of a patient tweetset by tallying the appearance of LabMT words found in the tweetset. The average happiness value for a tweetset is thus a weighted arithmetic mean of the words' average happiness scores, with each word weighted by its frequency in the tweetset. To increase the emotional signal, neutral “stop words” (4 ≤ h_avg ≤ 6) are removed from the analysis. Word-shift graphs illustrate emotional word frequency distributions (e.g., Fig. 2 and Fig. 3) and were used to characterize the relative sentiment differences between cancer types and also to compare all cancer patient tweets relative to background tweets. The background reference tweetset was a random sample of other geotagged tweets from the same March-December time frame.
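The frequency-weighted mean described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' code: the dictionary `TOY_LABMT` is a hypothetical stand-in for the 10,222-word LabMT list, and only the scores for “laughter” and “terrorist” come from the text; all other scores are invented placeholders.

```python
from collections import Counter

# Toy stand-in for the LabMT list. Only "laughter" and "terrorist" carry the
# scores quoted in the text; every other score is a made-up placeholder.
TOY_LABMT = {
    "laughter": 8.50,
    "terrorist": 1.30,
    "love": 8.0,   # hypothetical score
    "hate": 2.0,   # hypothetical score
    "the": 5.0,    # hypothetical score, falls in the neutral stop-word band
}


def average_happiness(words, scores, stop_low=4.0, stop_high=6.0):
    """Frequency-weighted mean happiness of the scored, non-neutral words.

    Words missing from `scores` are ignored, and words whose score lies in
    [stop_low, stop_high] are treated as stop words and skipped.
    Returns None if no scored word remains.
    """
    counts = Counter(w.lower() for w in words)
    total = 0.0
    weight = 0
    for word, freq in counts.items():
        h = scores.get(word)
        if h is None or stop_low <= h <= stop_high:
            continue
        total += h * freq
        weight += freq
    return total / weight if weight else None
```

With the toy scores, `average_happiness("the love love hate laughter".split(), TOY_LABMT)` drops “the” as a stop word and averages the remaining four tokens: (8.0 × 2 + 2.0 + 8.5) / 4 = 6.625.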
      Table 1. For each National Cancer Institute (NCI) cancer diagnosis, filter terms were used to extract cancer-specific tweets. These tweets were then manually filtered to select patients.

      | NCI diagnosis | Filter terms | Tweets (n)/fraction of total (%)* | Patients (n)/fraction of diagnosis tweets (%) |
      | Breast | “Breast cancer” | 15,421/70.9 | 161/1.0 |
      | Lung | “Lung cancer” | 2928/13.5 | 37/1.3 |
      | Prostate | “Prostate cancer” | 1036/4.8 | 15/1.4 |
      | Colon, colorectal, rectal | “Colon cancer,” “colorectal,” “rectal cancer” | 773/3.6 | 24/3.1 |
      | Pancreas | “Pancreas cancer,” “pancreatic cancer” | 673/3.1 | 6/0.9 |
      | Thyroid | “Thyroid cancer” | 195/0.9 | 23/11.8 |
      | Lymphoma | “Lymphoma” | 180/0.8 | 22/12.2 |
      | Leukemia | “Leukemia” | 177/0.8 | 10/5.6 |
      | Melanoma | “Melanoma” | 141/0.6 | 14/9.9 |
      | Bladder | “Bladder cancer” | 93/0.4 | 3/3.2 |
      | Kidney | “Clear cell,” “kidney cancer,” “renal cell carcinoma,” “renal cancer” | 83/0.4 | 8/9.6 |
      | Endometrial | “Uterine cancer,” “endometrial cancer” | 43/0.2 | 10/23.3 |

      * Total number of diagnosis-specific tweets = 21,743.
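The two percentage columns of Table 1 use different denominators, which a short calculation makes explicit. In this sketch the function name `table1_percentages` is hypothetical; the counts and the total of 21,743 come from the table itself.

```python
TOTAL_DIAGNOSIS_TWEETS = 21743  # from the Table 1 footnote


def table1_percentages(tweets, patients, total=TOTAL_DIAGNOSIS_TWEETS):
    """Return the two Table 1 percentages for one diagnosis.

    First value: share of all diagnosis-specific tweets.
    Second value: share of that diagnosis's tweets authored by patients.
    """
    share_of_total = round(100 * tweets / total, 1)
    patient_share = round(100 * patients / tweets, 1)
    return share_of_total, patient_share
```

For breast cancer, `table1_percentages(15421, 161)` gives `(70.9, 1.0)`, and for endometrial cancer, `table1_percentages(43, 10)` gives `(0.2, 23.3)`, matching the tabulated values.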
      Fig. 1. Cancer patient tweet content categorization by cancer type. Most frequently tweeted subject matter for each unique cancer diagnosis. Content totals across all cancer diagnoses are shown in parentheses. (Color version of figure is available online.)
      Fig. 2. Word-shift graphs for the happiest cancer patient tweetset and the saddest cancer patient tweetset. Thyroid cancer patient tweets (T_comp thyroid) have a higher computed average happiness value (h_avg = 6.16) than kidney cancer patients (T_comp kidney; h_avg = 5.85) when these cancer patient tweetsets are compared to a reference (T_ref). The top 50 words responsible for the happiness shift between the two cancer types are displayed, along with their contribution to shifting the average happiness of the tweetset. The higher happiness score of the thyroid patient tweetset is driven by increased frequency of positive words such as “blessed,” “thank,” and “love” and decreased frequency of negative words such as expletives, “not,” “no,” and “lost.” In contrast, the kidney cancer patient tweetset shows increased frequency of negative words such as expletives, “mad,” and “can't,” as well as decreased usage of positive words such as “happy” and “lol.” The arrows (up, down) indicate an increase or decrease, respectively, of the word's frequency in the cancer tweetset relative to the reference period. The addition and subtraction signs indicate if the word contributes positively or negatively, respectively, to the average happiness score. The tweetset size is noted by the gray boxes; the balance represents the relative shifts in word types. The reference (T_ref) is the average of all the other cancer patient tweetsets minus the comparison set. (Color version of figure is available online.)
      Fig. 3. Word-shift analysis comparing total cancer patient tweets versus control tweets. Cancer patient tweets (T_comp) have a higher computed average happiness value (h_avg = 6.16) compared to the reference control (T_ref; h_avg = 5.99). The top 50 words responsible for a happiness shift between the two tweetsets are displayed, along with their contribution to shifting the average happiness of the tweetset. The higher happiness score of the cancer tweetset is driven by increased frequency of positive words such as “thank,” “blessed,” and “beautiful” and decreased frequency of negative words such as expletives, “can't,” “mad,” and “never.” The arrows (up, down) indicate an increase or decrease, respectively, of the word's frequency in the cancer tweetset relative to the reference period. The addition and subtraction signs indicate if the word contributes positively or negatively, respectively, to the average happiness score. The tweetset size is noted by the gray boxes; the balance represents the relative shifts in word types. The reference control is a random selection of geotagged tweets from the same timeframe as the comparison set. (Color version of figure is available online.)
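The per-word contributions shown in the word-shift graphs can be approximated as follows. This sketch assumes the formulation commonly used in hedonometrics, in which a word's contribution is the product of (a) how far its happiness score lies from the reference average and (b) the change in its relative frequency between the reference and comparison tweetsets. The function name `word_shift` and the toy inputs in the usage note are hypothetical, stop-word removal is assumed to have been applied beforehand, and both tweetsets are assumed to contain at least one scored word.

```python
def word_shift(ref_freq, comp_freq, scores):
    """Per-word contributions to h_avg(comparison) - h_avg(reference).

    ref_freq and comp_freq map word -> count; scores maps word -> happiness.
    Words without a score are ignored. Returns (h_ref, shifts), where shifts
    maps each word to its signed contribution; the contributions sum to the
    difference in average happiness between the two tweetsets.
    """
    words = [w for w in set(ref_freq) | set(comp_freq) if w in scores]
    n_ref = sum(ref_freq.get(w, 0) for w in words)
    n_comp = sum(comp_freq.get(w, 0) for w in words)
    h_ref = sum(scores[w] * ref_freq.get(w, 0) for w in words) / n_ref

    shifts = {}
    for w in words:
        # Change in relative frequency: the up/down arrows in the figures.
        delta_p = comp_freq.get(w, 0) / n_comp - ref_freq.get(w, 0) / n_ref
        # Sign of (score - h_ref): the +/- symbols in the figures.
        shifts[w] = (scores[w] - h_ref) * delta_p
    return h_ref, shifts
```

With invented scores `{"love": 8.0, "hate": 2.0}`, a reference of one “love” and one “hate,” and a comparison of three “love” and one “hate,” both words contribute +0.75: “love” is a positive word used more often, and “hate” is a negative word used less often. Together they account for the full rise from 5.0 to 6.5.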

      Results

      The most frequently tweeted cancers were breast (n = 15,421), lung (n = 2928), prostate (n = 1036), and colon and/or rectal (n = 773; Table 1). Patients were manually extracted for each unique cancer diagnosis, with a total of 161 patients for breast cancer, although this represented only a small fraction of the total breast cancer tweets (1.0%). This is in contrast to endometrial cancer, where, out of 43 total tweets, 10 patients were identified (23.3%). After patients were manually extracted for each cancer diagnosis, the content of the patients' tweets was qualitatively characterized. The patient tweets most often involved subject matter pertaining to treatment course (e.g., chemotherapy, radiation, and hospital visits) (n = 73, 26%), sharing about a diagnosis (n = 65, 23%), and commenting on a surgery or biopsy procedure (n = 42, 15%; Fig. 1). However, the most commonly tweeted subject matter tended to be cancer-type specific; for instance, breast cancer patients were more likely to tweet about their treatment course (n = 35) than about their diagnosis (n = 26), in contrast to prostate cancer patients, who were more likely to share about their diagnosis (n = 9) than their treatment course (n = 2; Fig. 1).
      The hedonometric analysis was performed on all available tweets for each patient within each cancer diagnosis (Table 2). When examining the cancer-specific tweetsets, the average computed happiness value was the greatest for thyroid (h_avg = 6.1625), breast (h_avg = 6.1485), and lymphoma (h_avg = 6.0977) cancers, whereas the average computed sentiment score was the lowest for pancreatic (h_avg = 5.8766), lung (h_avg = 5.8733), and kidney (h_avg = 5.8464) cancers (Table 2). The happiness value for each respective cancer diagnosis was driven by specific word shifts inherent to that diagnosis, when that cancer diagnosis was compared to all others (Table 2, Fig. 2, and Supplemental information). The summed cancer patient tweetset had a higher computed sentiment value (h_avg = 6.16) than a matched control (h_avg = 5.99) from the same time period (Fig. 3).
      Table 2. Computed average word happiness value (h_avg) for each cancer diagnosis and summary of major word shifts responsible for sentiment value.

      | Cancer type | Tweetset (n) | h_avg | Increased frequency words* | Decreased frequency words* |
      | Thyroid | 5673 | 6.1625 | “Blessed,” “thank,” “Christmas,” “love” | Expletives, “not,” “no,” “lost,” “die” |
      | Breast | 72,528 | 6.1485 | “Happy,” “love,” “welcome” | Expletives, “hate,” “never” |
      | Lymphoma | 5143 | 6.0977 | “God,” “win,” “photo,” “proud,” “miss” | “Not,” “don't,” “happy” |
      | Endometrial | 4939 | 6.0913 | “Love,” “sorry,” “surgery,” “pain” | Expletives, “hate,” “don't” |
      | Bladder | 1579 | 6.0843 | “Good,” “great,” “win” | “Love,” “don't,” “hate” |
      | Melanoma | 13,418 | 6.0611 | “Love,” “bloody,” “hell” | “Happy,” “great,” “good” |
      | Prostate | 16,161 | 6.0223 | “Good,” “great,” “nice,” expletives | “Love,” “happy” |
      | Colorectal | 9682 | 6.0149 | “Lol,” “good,” “not,” “no,” “hell” | “Happy,” “love,” “beautiful,” “welcome” |
      | Leukemia | 6042 | 5.9730 | “Smoke,” “hate,” “bored,” “haha” | “Happy,” “beautiful” |
      | Pancreas | 5117 | 5.8766 | Expletives, “don't,” “bad” | “Happy,” “great,” “thanks” |
      | Lung | 38,379 | 5.8733 | Expletives, “don't,” “hate,” “mean” | “Love,” “happy,” “great,” “thanks” |
      | Kidney | 7245 | 5.8464 | Expletives, “don't,” “hospital,” “surgery” | “Happy,” “lol,” “thank” |

      Positive sentiment words are displayed in bold, whereas negative sentiment words are displayed in italics. In general, as the h_avg increases, the data set contains increased frequency of positive words and decreased frequency of negative words.
      * See supplemental information for word shift figures.

      Discussion

      This study investigated the most commonly tweeted cancers and identified patients with active disease, as well as those in remission. That breast cancer was the top-tweeted cancer was not surprising, considering breast cancer is one of the most prevalent cancer types,
      American Cancer Society
      Cancer facts & figures 2015.
      the large public awareness surrounding the disease, and the highly publicized and endorsed October breast cancer awareness month. The national incidence of lung, prostate, and colorectal cancer is most likely reflected in the relative frequency of tweets pertaining to these diseases. When patients were manually extracted from the cancer-specific tweetsets, the likelihood of identifying a patient varied by cancer type. For instance, only 1% of all breast cancer tweets were patient tweets, most likely because nondiagnosed individuals are tweeting about the disease for fundraising purposes or sharing feelings about a loved one with the disease. This is in contrast to endometrial cancer, where 23% of the disease tweets were from patients, suggesting that either patients with endometrial cancer are more likely to tweet than other cancer patients or there is less public awareness surrounding the diagnosis, and thus, there are fewer nonpatient tweets. Overall, nonpatient tweets for all cancer diagnoses were largely from individuals sharing about a loved one with the diagnosis. Further investigation into this area of patient tweets with respect to cancer type is warranted, particularly with cancer types that are more likely to affect younger patients, who may be more likely to tweet about their disease.
      In the patient tweetset, the overall theme was patients sharing about their treatment course, which could entail describing undergoing chemotherapy or radiation, being in the hospital, or returning for a follow-up visit. When the themes are differentiated based on cancer type, however, variations emerge that most likely reflect the nature and natural history of the diagnosis. For instance, only endometrial cancer patients shared about reproductive concerns, and breast cancer patients were very likely to share about their biopsy and/or surgery. This finding highlights that Twitter could be used to track patients' subjective experience of undergoing treatment for cancer in real time, allowing for identification of unmet patient needs.
      The study also examined the happiness values of cancer patients for each diagnosis. By using the patients' Twitter ID numbers, all tweets for the users were obtained including noncancer-related tweets. This method allowed for analysis of the patients' sentiments beyond their cancer diagnoses. Predictably, cancer tweetsets with higher happiness values used more positive words and fewer negative words; the opposite is true for those tweetsets with lower happiness values (Table 2 and Fig. 2). Since the happiness values were computed using all tweets from the patients, not just cancer-related tweets, it is apparent that the cancer diagnosis permeates through patients' lives and reinforces the disease-illness dichotomy. This idea represents the notion that the “disease” is the pathologic diagnosis, whereas the “illness” refers to how each individual responds, copes, and manages the disease in daily life.
      • Helman C.G.
      Disease versus illness in general practice.
      The fact that variations are observed between cancer types, with thyroid cancer patients having the highest average happiness value and kidney cancer patients having the lowest happiness value (Fig. 2), demonstrates that the natural history inherent to the cancer type may affect patients in unique ways. The differences in happiness may perhaps be explained by the prognosis of the cancer type, as thyroid cancer (97.8%, 5-y survival) and breast cancer (89.2%, 5-y survival) are generally amenable to treatment with favorable outcomes, whereas pancreatic (6.7%, 5-y survival) and lung (16.8%, 5-y survival) cancer carry a worse prognosis.
      American Cancer Society
      Cancer facts & figures 2015.
      However, the low happiness value of kidney cancer does not conform to the trend (72%, 5-y survival),
      American Cancer Society
      Cancer facts & figures 2015.
      and the value may represent morbidity associated with the treatment or the natural history of the disease. For instance, kidney cancer patients more often use the negative words “surgery” and “hospital.” Likewise, endometrial cancer patients use the terms “surgery” and “pain” more frequently. Through the differences in computed happiness values, it appears that the specific cancer diagnosis, not just the mere fact of having “cancer,” affects patients throughout their lives and patients manifest their illness through social media. At the moment, these results are correlative, and whether certain cancer patients are indeed happier than others requires more thorough investigation.
      When all cancer patient tweets were compared with a random time-matched control of other geotagged tweets, the cancer patient tweetset had a higher computed happiness value (h_avg = 6.16 versus 5.99; Fig. 3). This was driven by a combination of increased use of positive words and decreased use of negative words. This trend may be explained by surviving cancer patients having a unique perspective on life and being more thankful for being cancer-free. This interpretation is supported by cancer patients' tweets containing increased frequency of words such as “thank,” “blessed,” “beautiful,” and “love.” Cancer patients may also be more grateful and appreciative of family members and others, evidenced by increased use of the word “family” and a dramatic decrease in “me,” implying that cancer patients may be more likely to look outward and be less self-centered. The cancer patient tweetset had a lower frequency of negative words such as “don't,” “ain't,” and “can't,” suggesting that cancer patients may feel more empowered, reinforcing the “survivor” theme that is so often found within cancer support groups. Although it contributes negatively to the happiness value, cancer patients were more apt to use the word “pain.” Cancer-related pain is a difficult symptom to manage, evidenced by the World Health Organization pain ladder, and the increasing dialog and research surrounding palliative care
      • Zimmermann C.
      • Swami N.
      • Krzyzanowska M.
      • et al.
      Early palliative care for patients with advanced cancer: a cluster-randomised controlled trial.
      • Partridge A.H.
      • Seah D.
      • King T.
      • et al.
      Developing a service model that integrates palliative care throughout cancer care: the time is now.
      and medical marijuana.
      • Carter G.T.
      • Flanagan A.M.
      • Earleywine M.
      • Abrams D.I.
      • Aggarwal S.K.
      • Grinspoon L.
      Cannabis in palliative medicine: improving care and reducing opioid-related morbidity.
      • Bowles D.W.
      • O'Bryant C.L.
      • Camidge D.R.
      • Jimeno A.
      The intersection between cannabis and cancer in the United States.
      This study highlights that pain is certainly in the cancer patient dialog, as “pain” is in the top 10 word rank. Since the study includes patients with both active and remote disease, it is unclear if pain is persisting throughout the cancer trajectory or just in patients with active disease. Further research can help to delineate this and address whether the medical community is acting aggressively enough to address the pain needs of cancer patients.
      Overall, the word shifts of the cancer patient population have provided quantitative insight into the subjective experience of patients, laying the foundation for further investigation into discovering unmet patient needs, such as pain management, or to reinforce and bolster the highlighted positive sentiments within support communities.
      Limitations of this study include a sample bias in that the original filter term was “cancer.” This is problematic for diseases such as melanoma, leukemia, and lymphoma, where patients may not include the term “cancer” in their tweet; thus, these diseases are likely underrepresented in our data set. Included in the patient tweetset were patients with both active and remote disease, and thus for diseases with a greater prevalence (such as breast cancer), the patient population is more likely to be survivors than for those diseases with a greater mortality. This fact may skew the data and the happiness values; yet, it is also important to examine patients with active and remote disease simultaneously, as in this study, to better appreciate the cancer timeline trajectory, as most cancers are slowly morphing into chronic disease. To delineate between patients with active disease and those with remote disease, further studies are warranted. Furthermore, the tweetsets used in the study were obtained from the geotagged database, representing only 1% of all total tweets. Due to the relatively small number of extracted patients in each cancer category, one cannot determine if the differences observed in happiness index between different cancer types approach statistical significance. A study incorporating tweets from a larger patient population (such as the entire Twitter database) over a greater time period would be required to make this determination. By including tweets over a whole calendar year, the happiness values may shift given the prevalence of seasonal affective disorder; yet, as the prevalence of seasonal affective disorder is known to vary with latitude
      • Roecklein K.A.
      • Rohan K.J.
      Seasonal affective disorder: an overview and update.
      and this being a national study, the happiness differences would most likely be subtle. Our present study provides a sound basis for performing this type of Twitter analysis, since before our findings the presence and pattern of cancer patient tweets were not known. From this small sample of tweets, the data demonstrate that patients are indeed tweeting about health care concerns, and when the approach is extended to a larger data set, there is much that can be researched about patient concerns, sentiments, and unmet patient needs, particularly for those patients with rarer cancers.

      Conclusions

      The most frequently tweeted cancers are breast, lung, and prostate cancer, and the most common theme of the tweets is sharing about treatment course. A hedonometric analysis of cancer-patient tweets demonstrated interdiagnosis variability, confirming the inherent natural history of the disease affects patient sentiments in unique ways. This preliminary study shows that patients do broadcast their illness through social media and that Twitter can and should be used as a source to gauge patient satisfaction and to discover unmet patient needs. The methodology described herein provides a foundation on which to explore additional aspects of the cancer patient experience and to investigate other areas of health care delivery to better understand how the disease process and medical care interventions affect patients.

      Acknowledgment

      This work was supported in part by the National Institutes of Health (NIH) Research Awards R01DA014028 & R01HD075669 and by Center of Biomedical Research Award P20GM103644 from the National Institute of General Medical Sciences to C.J.
      Authors' contributions: W.C.C. authored the manuscript, performed the pattern matching, patient extraction, and tweet categorization. E.C. collected the Twitter data, performed the hedonometric analysis, created the word-shift graphs, and provided technical support. J.M., C.J., and T.A.J. provided supervision with respect to study design, administrative support, and critical revision of the manuscript draft. W.C.C. and E.C. had full access to all of the data in the study and take responsibility for the integrity of the data and the accuracy of the data analysis. All authors approved the manuscript. The authors have no conflicts of interest to report.

      Disclosure

      The authors reported no proprietary or commercial interest in any product mentioned or concept discussed in the article.

      References

      1. About. Twitter. [Online] 2015. [Cited: March 2, 2015.] Available at: https://about.twitter.com/company.

        • Duggan M.
        • Ellison N.B.
        • Lampe C.
        • Lenhart A.
        • Madden M.
        Social media update 2014.
        Pew Research Center, Washington, DC; 2015 (Available at: http://www.pewinternet.org/2015/01/09/social-media-update-2014/)
        • Mitchell L.
        • Frank M.R.
        • Harris K.D.
        • Dodds P.S.
        • Danforth C.M.
        The geography of happiness: connecting twitter sentiment and expression, demographics, and objective characteristics of place.
        PLoS One. 2013; 8: e64417
        • Cody E.M.
        • Reagan A.J.
        • Mitchell L.
        • Dodds P.S.
        • Danforth C.M.
        Climate change sentiment on twitter: an unsolicited public opinion poll.
        PLoS One. 2015; 10: e0136092
        • Caldarelli G.
        • Chessa A.
        • Pammolli F.
        • et al.
        A multi-level geographical study of Italian political elections from Twitter data.
        PLoS One. 2014; 9: e95809
        • Borondo J.
        • Morales A.J.
        • Losada J.C.
        • Benito R.M.
        Characterizing and modeling an electoral campaign in the context of Twitter: 2011 Spanish Presidential election as a case study.
        Chaos. 2012; 22: 023138
        • Chan B.
        • Lopez A.
        • Sarkar U.
        The canary in the coal mine tweets: social media reveals public perceptions of non-medical use of opioids.
        PLoS One. 2015; 10: e0135072
        • Love B.
        • Himelboim I.
        • Holton A.
        • Stewart K.
        Twitter as a source of vaccination information: content drivers and what they are saying.
        Am J Infect Control. 2013; 41: 568-570
        • Wong C.A.
        • Sap M.
        • Schwartz A.
        • et al.
        Twitter sentiment predicts Affordable Care Act marketplace enrollment.
        J Med Internet Res. 2015; 17: e51
        • Cole-Lewis H.
        • Pugatch J.
        • Sanders A.
        • et al.
        Social listening: a content analysis of e-cigarette discussions on twitter.
        J Med Internet Res. 2015; 17: e243
        • Lampos V.
        • Cristianini N.
        Tracking the flu pandemic by monitoring the social web.
        Cognitive Information Processing (CIP), 2010 2nd International Workshop. 2010: 411-416
        • Signorini A.
        • Segre A.M.
        • Polgreen P.M.
        The use of Twitter to track levels of disease activity and public concern in the US during the influenza A H1N1 pandemic.
        PLoS One. 2011; 6: e19467
        • Goff D.A.
        • Kullar R.
        • Newland J.G.
        Review of twitter for infectious diseases clinicians: useful or a waste of time?
        Clin Infect Dis. 2015; 60: 1533-1540
        • Paul M.J.
        • Dredze M.
        You are what you Tweet: Analyzing Twitter for public health.
        International Association for the Advancement of Artificial Intelligence Conference on Web and Social Media, Barcelona, Spain; 2011
        • De la Torre-Díez I.
        • Díaz-Pernas F.J.
        • Antón-Rodríguez M.
        A content analysis of chronic diseases social groups on Facebook and Twitter.
        Telemed J E Health. 2012; 18: 404-408
        • Thackeray R.
        • Burton S.
        • Giraud-Carrier C.
        • Rollins S.
        • Draper C.
        Using Twitter for breast cancer prevention: an analysis of breast cancer awareness month.
        BMC Cancer. 2013; 13: 508
        • Lyles C.R.
        • López A.
        • Pasick R.
        • Sarkar U.
        “5 mins of uncomfyness is better than dealing with cancer 4 a lifetime”: an exploratory qualitative analysis of cervical and breast cancer screening dialogue on Twitter.
        J Cancer Educ. 2013; 28: 127-133
        • Sugawara Y.
        • Narimatsu H.
        • Hozawa A.
        • Shao L.
        • Otani K.
        • Fukao A.
        Cancer patients on Twitter: a novel patient community on social media.
        BMC Res Notes. 2012; 5: 699
        • American Cancer Society
        Cancer facts & figures 2015.
        American Cancer Society, Atlanta; 2015
        • Dodds P.S.
        • Harris K.D.
        • Kloumann I.M.
        • Bliss C.A.
        • Danforth C.M.
        Temporal patterns of happiness and information in a global social network: hedonometrics and Twitter.
        PLoS One. 2011; 6: e26752
        • Helman C.G.
        Disease versus illness in general practice.
        J R Coll Gen Pract. 1981; 31: 548-552
        • Zimmermann C.
        • Swami N.
        • Krzyzanowska M.
        • et al.
        Early palliative care for patients with advanced cancer: a cluster-randomised controlled trial.
        Lancet. 2014; 383: 1721-1730
        • Partridge A.H.
        • Seah D.
        • King T.
        • et al.
        Developing a service model that integrates palliative care throughout cancer care: the time is now.
        J Clin Oncol. 2014; 32: 3330-3336
        • Carter G.T.
        • Flanagan A.M.
        • Earleywine M.
        • Abrams D.I.
        • Aggarwal S.K.
        • Grinspoon L.
        Cannabis in palliative medicine: improving care and reducing opioid-related morbidity.
        Am J Hosp Palliat Care. 2011; 28: 297-303
        • Bowles D.W.
        • O'Bryant C.L.
        • Camidge D.R.
        • Jimeno A.
        The intersection between cannabis and cancer in the United States.
        Crit Rev Oncol Hematol. 2012; 83: 1-10
        • Roecklein K.A.
        • Rohan K.J.
        Seasonal affective disorder: an overview and update.
        Psychiatry (Edgmont). 2005; 2: 20-26

      Linked Article

      • Twitter as a survey tool for real-time unbiased snapshots of personal sentiment in population level
        Journal of Surgical Research, Vol. 206, Issue 2