Fundación Universitaria Konrad Lorenz
Instructor: Viviana Márquez vivianam.penama@konradlorenz.edu.co
Class #13: May 20, 2021
These models use predefined scores for each word and compute the score of a sentence by averaging them. (They can also detect modifiers.)
😭 Limitation: They only work in English.
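The averaging idea can be sketched with a toy scorer. The lexicon, the intensifier multipliers, and the negation rule below are made-up values for illustration only; this is not the actual word list or algorithm of VADER or TextBlob.

```python
import re

# Hypothetical word scores and modifiers, invented for this sketch
TOY_LEXICON = {"funny": 0.5, "horrible": -1.0, "kind": 0.6}
INTENSIFIERS = {"very": 1.3, "so": 1.5}
NEGATIONS = {"not", "isn't", "never"}

def toy_polarity(text):
    """Average the scores of known words, adjusted by simple modifiers."""
    tokens = re.findall(r"[a-z']+", text.lower())
    scores = []
    for i, token in enumerate(tokens):
        if token not in TOY_LEXICON:
            continue  # unknown words do not contribute
        score = TOY_LEXICON[token]
        # An intensifier right before the word scales its score
        if i > 0 and tokens[i - 1] in INTENSIFIERS:
            score *= INTENSIFIERS[tokens[i - 1]]
        # A negation within the three previous tokens flips and dampens it
        if any(t in NEGATIONS for t in tokens[max(0, i - 3):i]):
            score *= -0.5
        scores.append(score)
    return sum(scores) / len(scores) if scores else 0.0

for ejemplo in ["Goku is very funny", "Goku is SO FUNNY!!!!!", "Bogota has a metro"]:
    print(ejemplo, toy_polarity(ejemplo))
```

A sentence with no known words, such as "Bogota has a metro", gets 0.0 because nothing in it appears in the lexicon.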
pip install vaderSentiment
pip install textblob
ejemplo1 = "At least it isn't a horrible book."
ejemplo2 = "Make sure you :) or :D today!"
ejemplo3 = "Goku is very funny"
ejemplo4 = "Goku is SO FUNNY!!!!!"
ejemplo5 = "Bogota has a metro"
from textblob import TextBlob
TextBlob(ejemplo3).sentiment
Sentiment(polarity=0.325, subjectivity=1.0)
from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer
vader = SentimentIntensityAnalyzer()
vader.polarity_scores("It was a horrible hotel but at least the employees were kind")
{'neg': 0.134, 'neu': 0.593, 'pos': 0.273, 'compound': 0.5187}
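The `compound` value is a normalized score between -1 and 1. A minimal sketch of turning it into a label, assuming the ±0.05 cutoffs suggested in the VADER documentation (the function name `vader_label` is hypothetical):

```python
def vader_label(compound, threshold=0.05):
    """Map a VADER compound score to a coarse label."""
    if compound >= threshold:
        return "positive"
    if compound <= -threshold:
        return "negative"
    return "neutral"

# Scores copied from the output above
scores = {'neg': 0.134, 'neu': 0.593, 'pos': 0.273, 'compound': 0.5187}
print(vader_label(scores['compound']))
```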
import pandas as pd
data = pd.read_csv("../archivos/machine_learning.csv")
data.head()
 | tweet | date | link |
---|---|---|---|
0 | If you've ever worried about facial recognition, you're not alone. https://t.co/RvZrwxMzPH | 2021-05-20 15:07:58 | https://twitter.com/i/web/status/1395395883814907904 |
1 | ostdoc on Proton Structure Studies with Machine Learning (Theory) — AcademicTransfer https://t.co/Iwz0cPVieS #ai #ml #dl | 2021-05-20 15:07:50 | https://twitter.com/i/web/status/1395395850663170053 |
2 | Is Artificial Intelligence/Machine Learning Real? \n\nAnother strong program 1 PM today online by Hudson Valley Direct Marketing Association https://t.co/knNT9pKcvq\ncbsi Services Inc. \n#AI #ArtificialInteligence #digitalmarketing #machinelearning #datascience #dataanalytics | 2021-05-20 15:07:44 | https://twitter.com/i/web/status/1395395823220006915 |
3 | End-to-End IoT analytics and machine learning with Azure Data and AI services https://t.co/ooJt94cvZ7 | 2021-05-20 15:06:46 | https://twitter.com/i/web/status/1395395582819282946 |
4 | Graph algorithms combined with machine learning offer a more modern and intelligent approach in fighting, monitoring, and investigating illicit activity. Learn more next week with @AmerBanker.\n\nhttps://t.co/ZTYcVkeT9j\n\n#banking #fintech #fraud #finance #graphanalytics #graphDB | 2021-05-20 15:06:30 | https://twitter.com/i/web/status/1395395514070441986 |
data['sentimiento_vader'] = data['tweet'].apply(lambda valor: vader.polarity_scores(valor)['compound'])
data['sentimiento_textblob'] = data['tweet'].apply(lambda valor: TextBlob(valor).sentiment.polarity)
data.head()
 | tweet | date | link | sentimiento_vader | sentimiento_textblob |
---|---|---|---|---|---|
0 | If you've ever worried about facial recognition, you're not alone. https://t.co/RvZrwxMzPH | 2021-05-20 15:07:58 | https://twitter.com/i/web/status/1395395883814907904 | -0.1179 | 0.000000 |
1 | ostdoc on Proton Structure Studies with Machine Learning (Theory) — AcademicTransfer https://t.co/Iwz0cPVieS #ai #ml #dl | 2021-05-20 15:07:50 | https://twitter.com/i/web/status/1395395850663170053 | 0.0000 | 0.000000 |
2 | Is Artificial Intelligence/Machine Learning Real? \n\nAnother strong program 1 PM today online by Hudson Valley Direct Marketing Association https://t.co/knNT9pKcvq\ncbsi Services Inc. \n#AI #ArtificialInteligence #digitalmarketing #machinelearning #datascience #dataanalytics | 2021-05-20 15:07:44 | https://twitter.com/i/web/status/1395395823220006915 | 0.5106 | 0.033333 |
3 | End-to-End IoT analytics and machine learning with Azure Data and AI services https://t.co/ooJt94cvZ7 | 2021-05-20 15:06:46 | https://twitter.com/i/web/status/1395395582819282946 | 0.0000 | 0.000000 |
4 | Graph algorithms combined with machine learning offer a more modern and intelligent approach in fighting, monitoring, and investigating illicit activity. Learn more next week with @AmerBanker.\n\nhttps://t.co/ZTYcVkeT9j\n\n#banking #fintech #fraud #finance #graphanalytics #graphDB | 2021-05-20 15:06:30 | https://twitter.com/i/web/status/1395395514070441986 | -0.4654 | 0.400000 |
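The two libraries do not always agree: in row 4, VADER gives -0.4654 while TextBlob gives 0.4. A small sketch for flagging such conflicts, using the five score pairs from the table above. The helper name `signs_disagree` and the 0.05 tolerance are assumptions chosen for illustration.

```python
def signs_disagree(score_a, score_b, tol=0.05):
    """True when one score is clearly positive and the other clearly negative."""
    return (score_a > tol and score_b < -tol) or (score_a < -tol and score_b > tol)

# (sentimiento_vader, sentimiento_textblob) pairs from the rows shown above
rows = [(-0.1179, 0.0), (0.0, 0.0), (0.5106, 0.033333), (0.0, 0.0), (-0.4654, 0.4)]
conflicts = [i for i, (v, t) in enumerate(rows) if signs_disagree(v, t)]
print(conflicts)
```

On the full DataFrame the same helper could be applied row by row, for example with `data.apply(lambda r: signs_disagree(r['sentimiento_vader'], r['sentimiento_textblob']), axis=1)`.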
pd.set_option('display.max_colwidth', None)
data[['tweet', 'sentimiento_vader']].sort_values('sentimiento_vader')
 | tweet | sentimiento_vader |
---|---|---|
72 | Got an INSANE antivirus yesterday it’s government based and is built on AI and machine learning literally stops a virus before it can even touch my computer along with learning about the code in it to fight things that could be similar crazy af | -0.8415 |
40 | 🚨 NEW: CVE-2021-29581 🚨 TensorFlow is an end-to-end open source platform for machine learning. Due to lack of validation in `tf.raw_ops.CTCBeamSearchDecoder`, an attacker can trigger denial of service via segmentat... (click for more) Severity: MEDIUM https://t.co/EHVlvaeNSA | -0.7184 |
36 | @Money_Reasons @nytimes As for history, you hardly need a time machine to see the disparity. The amount of scandals and atrocities present in the last 100 years of US history that go unmentioned in history classes are enumerable. It’s unsurprising that many have issues with US policy after learning this | -0.4939 |
4 | Graph algorithms combined with machine learning offer a more modern and intelligent approach in fighting, monitoring, and investigating illicit activity. Learn more next week with @AmerBanker.\n\nhttps://t.co/ZTYcVkeT9j\n\n#banking #fintech #fraud #finance #graphanalytics #graphDB | -0.4654 |
44 | [En direct]🔥 \nLe #live Machine learning & Deep learning c'est maintenant sur le canal #twitch de Tera Campus ! \n\nhttps://t.co/fURuKuDnn0\n\n#streamer #Livestream #streaming #LiveConf #livestreaming #IntelligenceArtificielle https://t.co/x42hzCWPd4 | -0.4003 |
... | ... | ... |
7 | Call AI conversational intelligence leverages AI & machine-learning to incorporate key capabilities that maximize #sales potential to win deals. Features include call recording & transcripts, call sharing & collaboration, & CRM integrations. Learn more 👉 https://t.co/WIXLn2fC6G | 0.8658 |
77 | Everything awesome ends quickly - that is also true for the 5th edition of the NDSML Summit. During the past 3 days, we've heard some amazing case studies, strategies and approaches in Data Science and Machine Learning. ✨\n\nThank you to everyone joining🙏\n\n#NDSMLSummit2021 https://t.co/Q2MKeKMLue | 0.9382 |
30 | Implementing new #technologies like artificial intelligence and machine learning are part of a solid business strategy. Some key advantages include increased productivity, higher rates of production, improved quality, and efficient use of materials. #MSP https://t.co/KktxtkuD3o | 0.9403 |
53 | This is a true opportunity who has an interest in Machine Learning. I'm on in too. Let's enjoy it together!\n* Thank you for giving me, a poor full-time student, a chance to take this course for free. @svpino https://t.co/viLwTe56Sp | 0.9531 |
49 | One of the rare times I look at LinkedIn, I come across this graph. IDK how it's made, but being in the field, I think it's such a good reference. Like, just have a good grasp on Python and how to use basic machine learning techniques and you can land a pretty good paying job. https://t.co/zIxwIO9vmt | 0.9614 |
100 rows × 2 columns
These models learn embeddings that capture similarities between words, which also makes them robust to spelling errors.
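To see why a vector representation can tolerate typos, here is a rough stand-in that uses character trigram counts instead of learned embeddings. This is not how Flair builds its embeddings (those are dense vectors learned by a neural network); the helpers `char_ngrams` and `cosine` are hypothetical and only illustrate that a misspelled word can stay close to the original.

```python
from collections import Counter
from math import sqrt

def char_ngrams(word, n=3):
    """Count the character n-grams of a word, with boundary markers."""
    padded = f"<{word.lower()}>"
    return Counter(padded[i:i + n] for i in range(len(padded) - n + 1))

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[g] * b[g] for g in a if g in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

correct = char_ngrams("horrible")
typo = char_ngrams("horible")
other = char_ngrams("funny")

print(cosine(correct, typo))   # high: most trigrams are shared
print(cosine(correct, other))  # zero: no trigrams in common
```

A plain word lexicon would treat "horible" as unknown, while a representation built from smaller pieces still places it near "horrible".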
pip install flair
https://pytorch.org/
Option: Google Colab
from flair.models import TextClassifier
from flair.data import Sentence
classifier = TextClassifier.load('en-sentiment')
sentence = Sentence(ejemplo1)
print(ejemplo1)
classifier.predict(sentence)  # attaches the predicted label to the sentence
labels = sentence.labels
labels[0].value, labels[0].score
2021-05-20 11:43:53,558 loading file /Users/vivianamarquez/.flair/models/sentiment-en-mix-distillbert_3.1.pt
At least it isn't a horrible book.
('POSITIVE', 0.8074595928192139)
classifier = TextClassifier.load('sentiment')
sentence = Sentence("Instalar Pytorch es un dolor de cabeza, lo odio")  # "Installing PyTorch is a headache, I hate it"
classifier.predict(sentence)
sentence.labels
2021-05-20 12:05:59,821 loading file /Users/vivianamarquez/.flair/models/sentiment-en-mix-distillbert_3.1.pt
[NEGATIVE (0.9438)]
classifier = TextClassifier.load('sentiment')

def sent_flair(texto):
    """Return Flair's score with a sign: positive for POSITIVE, negative for NEGATIVE."""
    sentence = Sentence(texto)
    classifier.predict(sentence)
    result = sentence.labels[0]
    label = result.value
    score = result.score
    if label == 'POSITIVE':
        return score
    if label == 'NEGATIVE':
        return -1 * score
    return score  # fallback for any other label
data['sent_flair'] = data['tweet'].apply(lambda valor: sent_flair(valor))
data.head()
2021-05-20 12:06:21,805 loading file /Users/vivianamarquez/.flair/models/sentiment-en-mix-distillbert_3.1.pt
 | tweet | date | link | sentimiento_vader | sentimiento_textblob | sent_flair |
---|---|---|---|---|---|---|
0 | If you've ever worried about facial recognition, you're not alone. https://t.co/RvZrwxMzPH | 2021-05-20 15:07:58 | https://twitter.com/i/web/status/1395395883814907904 | -0.1179 | 0.000000 | 0.997260 |
1 | ostdoc on Proton Structure Studies with Machine Learning (Theory) — AcademicTransfer https://t.co/Iwz0cPVieS #ai #ml #dl | 2021-05-20 15:07:50 | https://twitter.com/i/web/status/1395395850663170053 | 0.0000 | 0.000000 | 0.931684 |
2 | Is Artificial Intelligence/Machine Learning Real? \n\nAnother strong program 1 PM today online by Hudson Valley Direct Marketing Association https://t.co/knNT9pKcvq\ncbsi Services Inc. \n#AI #ArtificialInteligence #digitalmarketing #machinelearning #datascience #dataanalytics | 2021-05-20 15:07:44 | https://twitter.com/i/web/status/1395395823220006915 | 0.5106 | 0.033333 | 0.611579 |
3 | End-to-End IoT analytics and machine learning with Azure Data and AI services https://t.co/ooJt94cvZ7 | 2021-05-20 15:06:46 | https://twitter.com/i/web/status/1395395582819282946 | 0.0000 | 0.000000 | 0.997630 |
4 | Graph algorithms combined with machine learning offer a more modern and intelligent approach in fighting, monitoring, and investigating illicit activity. Learn more next week with @AmerBanker.\n\nhttps://t.co/ZTYcVkeT9j\n\n#banking #fintech #fraud #finance #graphanalytics #graphDB | 2021-05-20 15:06:30 | https://twitter.com/i/web/status/1395395514070441986 | -0.4654 | 0.400000 | 0.742857 |
data[['tweet', 'sent_flair']].sort_values('sent_flair')
 | tweet | sent_flair |
---|---|---|
56 | @IamHappyToast They probably ran it through Adobe Super Resolution or some other machine learning AI upscaling software. I'd be most interested in looking at the file headers and whomever buys it is a mug. | -0.999953 |
57 | Just thinking about how when I was learning to type, different keyboards would have symbols in different places. Depending on which machine in the computer lab you got, apostrophe might be shift-7 or its own key beside return. | -0.999580 |
20 | Get Flat 96% Off on Machine Learning on Udemy \n\nStudy Now!\nhttps://t.co/ChqyvOhIZ1\n-#sridivya https://t.co/xKHmw4DQbX | -0.999491 |
32 | In this episode of Gradient Dissent, we're joined by @sAlyssaSF, Director of Product at @BlueShieldCA and co-author of "Real World AI" 💫\n\n🎥: https://t.co/vY1HLkpBhD\n✏️: https://t.co/I3LfzMukJI https://t.co/E8oDeEVqID | -0.984372 |
40 | 🚨 NEW: CVE-2021-29581 🚨 TensorFlow is an end-to-end open source platform for machine learning. Due to lack of validation in `tf.raw_ops.CTCBeamSearchDecoder`, an attacker can trigger denial of service via segmentat... (click for more) Severity: MEDIUM https://t.co/EHVlvaeNSA | -0.981582 |
... | ... | ... |
26 | If you've ever worried about facial recognition, you're not alone. https://t.co/rGCzp79GQx | 0.998753 |
35 | Today another use case. This time: #healthcare. #ML model gauges #unconsciousness in patients under #anesthesia, to help dosage of anesthetics and reducing risks for the patient😴✅\n\nLooking for data for your next #AI project?\n👉https://t.co/JTBatwrAzW\n\n🔗https://t.co/em5Ddc6vq8 | 0.998872 |
30 | Implementing new #technologies like artificial intelligence and machine learning are part of a solid business strategy. Some key advantages include increased productivity, higher rates of production, improved quality, and efficient use of materials. #MSP https://t.co/KktxtkuD3o | 0.999015 |
97 | 1. Automating processes, improving efficiencies and enabling alternative care delivery models\n2. Predicting risks and suggesting interventions\n3. Revealing patterns in disease, getting to a diagnosis quickly and advancing therapies…https://t.co/QyvLmmltnF https://t.co/hVfFDJIi44 | 0.999432 |
49 | One of the rare times I look at LinkedIn, I come across this graph. IDK how it's made, but being in the field, I think it's such a good reference. Like, just have a good grasp on Python and how to use basic machine learning techniques and you can land a pretty good paying job. https://t.co/zIxwIO9vmt | 0.999643 |
100 rows × 2 columns
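Note that Flair's score is the classifier's confidence in the predicted label, not an intensity. That explains why tweets that VADER and TextBlob rate as 0.0 still get values above 0.9 here. One possible adjustment, sketched below, treats uncertain predictions as neutral. The `min_confidence` cutoff of 0.9 is an arbitrary assumption and not part of Flair.

```python
def signed_confidence(label, score, min_confidence=0.9):
    """Signed Flair score, with low-confidence predictions mapped to 0.0."""
    if score < min_confidence:
        return 0.0  # too uncertain to call either way
    if label == "POSITIVE":
        return score
    if label == "NEGATIVE":
        return -score
    return 0.0  # unknown label

print(signed_confidence("POSITIVE", 0.611579))  # below the cutoff
print(signed_confidence("NEGATIVE", 0.9438))
```

This only neutralizes uncertain predictions, such as the 0.611579 from the `data.head()` output; confident predictions on factual tweets remain close to ±1.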
12 workshops: The 10 workshops with the highest grades will be counted (each worth 5.5% of the course grade). The 20 points from workshop 8 will also be added.
With VADER, you can add new words and their scores to the lexicon used by the sentiment analysis model.
from nltk.sentiment.vader import SentimentIntensityAnalyzer
new_words = {
    'chupemonda': 2.0,
    'madrugón': -3.4,
}
SIA = SentimentIntensityAnalyzer()
SIA.lexicon.update(new_words)
SIA.lexicon
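Before updating the lexicon it can help to check the new entries. The sketch below assumes two things about VADER: that its lexicon scores are on a scale from -4 to +4, and that tokens are lowercased before being looked up. The helper `prepare_lexicon_entries` is hypothetical and not part of NLTK or vaderSentiment.

```python
def prepare_lexicon_entries(new_words, low=-4.0, high=4.0):
    """Lowercase the keys and check that every score is within the expected range."""
    prepared = {}
    for word, score in new_words.items():
        if not low <= score <= high:
            raise ValueError(f"{word!r}: score {score} is outside [{low}, {high}]")
        prepared[word.lower()] = float(score)
    return prepared

new_words = {
    'Chupemonda': 2.0,   # capitalized on purpose to show the normalization
    'madrugón': -3.4,
}
print(prepare_lexicon_entries(new_words))
```

The result can then be passed to the analyzer as in the cell above, for example `SIA.lexicon.update(prepare_lexicon_entries(new_words))`.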